Iptic Memex is a Python program that offers a straightforward CLI for interacting with LLM providers through their APIs. Input can be piped in from the command line or entered interactively through 'ask' and 'chat' modes. Chat mode can save conversations in a human-readable format with a configurable extension for use with external applications such as Obsidian.
The name is a reference to the Memex, a device described by Vannevar Bush in his 1945 essay "As We May Think," in which he envisioned a machine that could compress and store all of one's knowledge. https://en.wikipedia.org/wiki/Memex
Multiple Interaction Modes:
- CLI mode for quick questions or scripting
- Chat mode for extended conversations
- Ask mode for single questions
- Completion mode for processing files or stdin
Context Management:
- Load text files into context in chat and ask modes
- Load PDF/XLSX/DOCX files into context in chat mode
- Fetch and include content from the web
- Search the web and load results into context
- Easily add multi-line content
- Select and import parts of Python files
- Add a project context for encapsulating other contexts
Provider Flexibility:
- Supports multiple LLM providers. Currently:
  - OpenAI
  - Anthropic
  - Google Gemini
  - OpenRouter
  - Perplexity
  - Groq
  - Mistral
  - DeepSeek
  - Cohere
  - Fireworks AI
  - Together AI
  - Llama.cpp via API
- OpenAI-compatible providers can be added through configs; no code changes needed
- Easy configuration of providers and models through config files
- Switch between providers and models on the fly
- Providers can be aliased in the config file for per-provider or per-model settings
Conversation Handling:
- Save and load conversations in human-readable formats
- Export conversations to various formats (markdown, txt, pdf)
- Context management for optimizing token usage
- Token usage tracking and management where applicable
- Easily save code blocks from responses to files
Enhanced User Experience:
- Streaming support for real-time responses
- Syntax highlighting for code blocks in chat
- Token usage tracking and context management
- Tab completion for file paths, commands, and settings
- Run code blocks from responses and capture the output
- Run your own shell commands and capture the output
Extensibility:
- Modular action system for easy feature additions
- Custom context handlers for various input types
The program is still in development and may have bugs or issues. Please report any problems you encounter.
- Clone the repository: `git clone https://github.com/acasto/iptic-memex.git`
- Install the dependencies: `pip install -r requirements.txt`
- Then run the program with: `python main.py`
- Configuration can be done through:
  - `config.ini` in the project directory
  - `~/.config/iptic-memex/config.ini` in the user directory
  - a custom .ini file via the `-c` or `--conf` flag
- Model configuration can be done through:
  - `models.ini` in the project directory
  - `~/.config/iptic-memex/models.ini` in the user directory
- API keys can be set in the config file as `api_key` or via environment variables (e.g., `OPENAI_API_KEY`); see the example below
- Usage is well documented with click and can be accessed with `python main.py --help` or `python main.py <subcommand> --help`
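For example, a typical first run might look like this (the config path is illustrative, and passing `--conf` before the subcommand is an assumption about the CLI layout):

```
export OPENAI_API_KEY=<your API key>
python main.py --conf ~/my-memex.ini chat
```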
- `python main.py --help`: Display general help
- `python main.py <subcommand> --help`: Show help for a specific subcommand
- `python main.py chat`: Enter chat mode
- `python main.py ask`: Enter ask mode
- `python main.py chat -f <filename>`: Chat about a specific file
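Since input can also be piped in, quick one-off questions can be scripted. A minimal sketch, assuming ask mode treats piped text as its input:

```
echo "Summarize the Memex concept in two sentences." | python main.py ask
```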
While in chat mode, you can use the following commands:
- `help`: Display a list of available commands
- `quit` or `exit`: Exit the chat mode
- `load project`: Load a project into the context
- `load file`: Load a file into the context
- `load pdf`: Load a PDF file into the context
- `load doc`: Load a DOCX file into the context
- `load sheet`: Load an XLSX file into the context
- `load code`: Load code snippets into the context
- `load multiline`: Load multiple lines of text into the context
- `load web`: Load content from a web page into the context
- `load soup`: Fetch content from a web page using BeautifulSoup
- `load search`: Perform a web search and load results
- `clear context`: Clear a specific item from the context
- `clear chat`: Reset the entire conversation state
- `clear last [n]`: Remove the last n messages from the chat history
- `clear first [n]`: Remove the first n messages from the chat history
- `clear`: Clear the screen
- `reprint`: Reprint the entire conversation
- `show settings`: Display all current settings
- `show models`: List all available models
- `show messages`: Display all messages in the current chat
- `show usage`: Show token usage statistics
- `set option`: Modify a specific option or setting
- `save chat`: Save the current chat session
- `save last`: Save only the last message of the chat
- `save full`: Save the full conversation including context
- `save code`: Extract and save code blocks from the conversation
- `run code`: Extract and run code blocks from the conversation
- `run command`: Run a shell command and optionally capture the output
- `load chat`: Load a previously saved chat session
- `list chats`: Display a list of all saved chat sessions
- `export chat`: Export the current chat in a specified format
These commands provide extensive control over the chat environment, allowing you to manage context, manipulate the conversation history, adjust settings, and interact with external resources seamlessly.
- `config.ini`: Main configuration file
- `models.ini`: Detailed model information and settings
- User-specific configurations can be added in `~/.config/iptic-memex/config.ini` and `~/.config/iptic-memex/models.ini`
To add a new OpenAI-compatible provider, just add a section to config.ini in the following format, along with any other settings you may want to override.
```
[provider_name]
alias = OpenAI
base_url = <the provider's base URL>
```
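As a concrete sketch, a hypothetical entry for a local OpenAI-compatible server (the section name and URL are purely illustrative):

```
[LocalServer]
alias = OpenAI
base_url = http://localhost:8080/v1
```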
Then you just need to add the models to models.ini like so:
```
[model short name]
provider = <the provider you set up>
model_name = <the full official model name>
context_size = 4096
response_label = "> My Model: "
```
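Continuing the hypothetical example above (the short name, model name, and context size are illustrative):

```
[mixtral]
provider = LocalServer
model_name = mixtral-8x7b-instruct
context_size = 32768
response_label = "> Mixtral: "
```

Once defined, the model should appear alongside the others under `show models` in chat mode.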
You can add additional parameters to the body of the request by using the `extra_body` setting with a provider in config.ini or a model in models.ini.
Examples:

Set a preferred order of providers when using OpenRouter:

```
extra_body = {provider: { order: [Together, Lepton] } }
```

Turn on prompt caching and set stop tokens via the llama.cpp API:

```
extra_body = {cache_prompt:true, stop:[<|im_end|>,<|im_start|>,<end_of_turn>,<|end|>]}
```
One of the more useful ways to use this program is to chat or ask questions about a file or URL. This can be done by supplying one or more `--file` (`-f`) options to the `chat` or `ask` subcommands. The file(s) will be loaded into the context through the prompt and available for you to ask questions about. Web context has been moved to chat mode and can be accessed with the `load web`, `load soup`, and `load search` commands.
For example:

```
python main.py chat -f problem_code.py
python main.py ask -f code.txt -f logfile.txt
```
- From within chat mode: `load file`, `load web`, or `load soup`
Note: `load web` uses the trafilatura library to retrieve a more simplified version of the web page, while `load soup` uses BeautifulSoup to scrape the raw HTML. Eventually these will probably be merged into a single, more robust command.
The `load search` action is currently based on the Brave Search API summarization endpoint, and its results get added to the context the same as other contexts (e.g., chatting with a file). The API key is set the same as for other providers. Support will eventually be added for other search providers and configurations.
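For example, a minimal sketch of the key entry in config.ini (the `[Brave]` section name is an assumption; match whatever section your setup uses for the search provider):

```
[Brave]
api_key = <your Brave Search API key>
```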
The `load multiline` command allows you to add multiple lines of text to the context. This can be useful for adding code snippets, error messages, or other multi-line content.
Now that LLMs are getting better at producing functional code, the ability to save a code block, rather than just copying it out, is useful. The `save code` command will extract code blocks from the most recent assistant response and provide a file save dialog (`save code <n>` can be used to parse the last n responses). If multiple code blocks are present, you will be presented with a choice of which to save.
The `run code` command will extract code blocks from the most recent assistant response and run them in the current Python environment. It currently supports Python and Bash code blocks and will ask for confirmation before running (`run code <n>` can be used to parse the last n responses). If multiple code blocks are present, you will be presented with a choice of which to run.
After running the command you will have the option to capture the output to a multiline context to feed back to the model for iterative troubleshooting.
The `run command` command lets you run shell commands from within chat mode and capture the output into the context for use in the conversation. This can be useful for running scripts or referencing system information.
The project context (`load project`) causes the other contexts (e.g., file, web, multiline) to be wrapped in common project context tags with a project name and project notes. You can add these other contexts from the `load project` dialog.
- Added models/providers to models.ini
- Added trailing slash multiline to main chat
- Added wildcard support to 'load file'
- Added timeout setting for OpenAI provider
- Added a 'load raw' context for unwrapped context
- Added 'load sheet', 'load doc', and 'load pdf'
- Fixed minor bugs
- Added the `run command` command to run shell commands and capture the output
- Added a provider class for Cohere
- Added entries in config.ini and models.ini to support Perplexity, Groq, Mistral, DeepSeek, and Cohere
- Adjusted the OpenAI provider so that stream usage can be disabled at the provider level (was breaking Mistral)
- Added the `run code` command with the ability to capture the output for iterative troubleshooting
- Minor bug fixes and improvements in error handling
- Updated README with new features and examples
- Implemented a more extensible architecture with a revamped provider system
- Introduced modular actions and contexts for easier functionality extension
- Enhanced support for multiple LLM providers (OpenAI, Anthropic, Google)
- Improved configuration management with separate model configurations
- Refactored code structure for better maintainability and extensibility
- Removed the URL command line arguments in favor of more robust context management in chat mode
- Added support for Anthropic and Google Gemini models
- Removed OpenRouter support temporarily
- Added models.ini for storing detailed model information
- Changed how model information is loaded and displayed
- Added ability to track token usage in chat mode with tiktoken
- New `context_window` option in config.ini for use in calculating remaining tokens
- Added `show tokens` command in chat mode to show the current token count
- When loading context into chat mode, the token count will be displayed before the first request is sent
- A notice will display when `max_tokens` exceeds the estimated remaining tokens
- Added ability to scrape text from a URL to be added into context for both chat and ask modes
- Scraped text can be filtered by ID or Class
- Added ability to accept multiple '-f' options for all modes
This project is licensed under the terms of the MIT license.