Access a file as a database with an interactive SQL query experience.
pip install -r requirements.txt
# Optional
cp example.env .env
# Fill in your keys
# ...
streamlit run app.py
https://raw.githubusercontent.com/daviddwlee84/DuckDB_Chat/main/demo/demo.csv
chainlit run --watch chainlit_app.py
chainlit run --watch chainlit_app_pandasai.py
- Add demo of DuckDB with Jupyter Notebook
- Able to use customized table alias
- Implement in a more elegant way => register table name
- Support multiple file loading (same extension and schema)
- Able to use DuckDB extensions (see: Extensions - DuckDB)
- Add row limit for potentially large files
- Treat a limit of <= 0 as no limit
- Add df.head() at initial state
- Add LIMIT at the end of SQL query
- Use a more elegant way...?
- Support CSV files without a header (or see what happens)
- Support DuckDB, SQLite, ... (Connecting to a database — Python documentation)
- Able to plot (maybe indicate by some prefix like Jupyter magics)
- Plot button (for each DataFrame or at the end) => this will make the code very messy
- tvst/st-execbox
- Running Jupyter cells inside streamlit? - 🎈 Using Streamlit - Streamlit
- Chart visualization — pandas 2.1.4 documentation
- Retrieve file from URL (DuckDB — Python documentation)
- Maybe the user can omit the "FROM" clause, which by default refers to the original file (initially: the latest generated table)
- Think of an option to choose between the latest generated table and the original file
- Option to print statistics (e.g. running time, rows, columns, ...)
- Add clear page button (clear the chat history but keep the file pointer) => Delete the file or upload a new one to clear history
- Add SQL hint and external resources on page
- Add cheat sheet image
- Add option to automatically run SELECT * FROM table when the file is loaded => Add session initial options
- Also print some memory information
- Deploy this repository to Streamlit
- Support more than SQL queries (i.e. DuckDB statements like CREATE TABLE, DESCRIBE, SHOW TABLES, etc.)
- Support DuckDB extensions
- Support web file path / API e.g. https://api.github.com/search/repositories?q=jupyter&sort=stars&order=desc
- Remove the file size limit of the file uploader
- Add natural language query support
- Can refer to this neural-maze/talking_with_hn: The full experience of chatting with your favourite news website.
- Inference LLM using API Key
- Streaming user experience
- Make an agent that can generate DuckDB SQL queries
- Make an agent that can summarize a table
- See how to change the main page name (i.e. app) without re-deploying the Streamlit app => Currently not possible
- Manage OpenAI keys in global settings
- Maybe move duplicate code block into one place => refactor
- Somehow failed to load from .env when disabling "key" in st.text_input
- Solve the issue that DuckDB read_parquet() requires a filename (instead of a file object) => Currently using pd.read_parquet + duckdb.from_df
- Support more file types (e.g. Excel, ...)
- Pipe (|) in SQL => multiple SQL statements, applied one after the other
- Support PRQL
- More Demo
- NL to SQL
- DBQA
- Able to construct SQLDatabase from uploaded file
- LangChain
- LlamaIndex
- NLSQLTableQueryEngine
- SQLTableRetrieverQueryEngine + ObjectIndex (retriever)
- Add Azure embed model
- Option for metadata
- Cost analysis
- Streaming
- PandasAI: gventuri/pandas-ai: PandasAI is the Python library that integrates Gen AI into pandas, making data analysis conversational
- Show plot image properly
- Count Token: Large language models (LLMs) - PandasAI
- Multiple files: SmartDatalake
- Agent: Clarification questions
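The row-limit items in the list above (a limit of <= 0 means no limit, otherwise add LIMIT at the end of the SQL query) could be sketched roughly as below. This is only an illustration, not the code used in app.py: the function name `apply_row_limit` is hypothetical, and the check for an existing LIMIT clause is a naive regular expression that only recognizes a trailing `LIMIT n` (optionally followed by `OFFSET m`).

```python
import re


def apply_row_limit(sql: str, row_limit: int) -> str:
    """Return the query with a trailing LIMIT clause added when appropriate.

    row_limit <= 0 is treated as "no limit", so the query is returned
    without a LIMIT (only surrounding whitespace and a trailing ';' are removed).
    """
    query = sql.strip().rstrip(";").rstrip()
    if row_limit <= 0:
        return query
    # Naive check: keep a LIMIT the user already wrote at the end of the query.
    has_limit = re.search(
        r"\blimit\s+\d+(\s+offset\s+\d+)?\s*$", query, flags=re.IGNORECASE
    )
    if has_limit:
        return query
    return f"{query} LIMIT {row_limit}"


limited = apply_row_limit("SELECT * FROM t;", 100)
unlimited = apply_row_limit("SELECT * FROM t", 0)
```

Appending text only makes sense for plain SELECT queries. Statements such as DESCRIBE or CREATE TABLE would need to bypass this helper, which may be part of why a more elegant approach is still an open question.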
Chainlit
- Try Chainlit
- Format user message as SQL code
- Support more file types (e.g. parquet)
- New table as attachment
- Manually set default table name
LangChain x DuckDB
SQLite
Deploy
- Deploy your app - Streamlit Docs
- App dependencies - Streamlit Docs
- Configuration - Streamlit Docs
- Secrets management - Streamlit Docs
- Copy .env settings to the Streamlit App Settings > Secrets
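Streamlit secrets are written in TOML, so .env lines such as KEY=value need quoting before they can be pasted into the Secrets box. A minimal sketch of that conversion, assuming a simple .env with one KEY=value pair per line; the helper name `dotenv_to_toml` and the sample key names are hypothetical and not part of this repository:

```python
import json


def dotenv_to_toml(dotenv_text: str) -> str:
    """Convert simple KEY=value lines into TOML `KEY = "value"` lines."""
    toml_lines = []
    for raw_line in dotenv_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        # json.dumps produces a double-quoted, escaped string; for typical
        # key values this is also a valid TOML basic string.
        toml_lines.append(f"{key} = {json.dumps(value, ensure_ascii=False)}")
    return "\n".join(toml_lines)


sample_env = '# Fill in your keys\nOPENAI_API_KEY="sk-test"\nEMPTY=\n\nexport REGION=eu\n'
secrets_toml = dotenv_to_toml(sample_env)
```

The resulting text can be pasted into App Settings > Secrets, or saved locally as .streamlit/secrets.toml for testing.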
LangChain x Streamlit
- SQL | 🦜️🔗 Langchain
- Case 1: Text-to-SQL query
- Case 2: Text-to-SQL query and execution
- Case 3: SQL agents
- SQL Database | 🦜️🔗 Langchain
- langchain.utilities.sql_database.SQLDatabase — 🦜🔗 LangChain 0.0.339rc1
- feat: parquet file support for SQL agent · Issue #2002 · langchain-ai/langchain (the author uses Parquet with DuckDB => converts to SQLite)
- CSV | 🦜️🔗 Langchain
- sugarforever/LangChain-SQL-Chain
- LLMs and SQL