This repo contains a set of reference designs for various ML topics. A few examples of the types of things you can learn here:
Browser Use is an open-source library that allows language models to interact with browser windows and page contents.
An Agent is a system that leverages an AI model to interact with its environment in order to achieve a user-defined objective. It combines reasoning, planning, and the execution of actions (often via external tools) to fulfill tasks.
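The reason, act, observe cycle described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repo's implementation: `mock_model`, `get_weather`, and `TOOLS` are hypothetical names, and the "model" is a hard-coded stand-in for an LLM.

```python
from typing import Callable, Dict, List, Tuple


def get_weather(city: str) -> str:
    """Mocked tool (hypothetical): a real agent would call an external API."""
    return f"Sunny in {city}"


# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}


def mock_model(objective: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM that returns (action, argument).

    With no observations it requests a tool call; once it has one,
    it finishes with a final answer built from the latest observation.
    """
    if not observations:
        return "get_weather", "Paris"
    return "final_answer", f"{objective} -> {observations[-1]}"


def run_agent(objective: str, max_steps: int = 5) -> str:
    """Loop: ask the model what to do, run the tool, feed the result back."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, argument = mock_model(objective, observations)  # reason and plan
        if action == "final_answer":
            return argument
        observations.append(TOOLS[action](argument))  # act, then observe
    return "Stopped: step limit reached"


if __name__ == "__main__":
    print(run_agent("Weather report"))
```

The step limit is a design choice worth keeping in any real agent: it bounds cost if the model never produces a final answer.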
- Overview
- Dummy Agent: A bare-bones agent with mocked tool calls
- Smol Agent: A template for using the SmolAgent framework with a Gradio web interface
- Code Agents: Use the SmolAgent framework to create agents that generate code to call tools and perform calculations.
  - Duration Agent: Generate code that uses authorized imports to add up the times listed in the prompt.
  - Menu Agent: Call a custom tool to generate a menu prompt and populate it using the model's built-in knowledge
  - Playlist Agent: Search the internet and generate a music playlist for a wedding
- Workout Agents: Use the SmolAgent framework to perform fitness-related planning tasks
  - Strength Plan Agent: An agent that considers how many reps you can do at a given weight and generates a strength training program
- Retrieval Agents: Retrieve data from specialized systems using the SmolAgent framework.
  - Basic Retrieval Agent: Search the internet using DuckDuckGo and form a total body fitness plan
  - NIST CSF Retrieval Agent: Use semantic search (search by meaning) against specialized NIST Cybersecurity Framework practices
- Multi-agents: Multiple agents working together
  - Park Planner Multi-Agent: Search the internet for national parks and calculate travel time by cargo plane. One agent searches the internet and the other handles planning and distance calculations
- Travel Agents: Use LangGraph to handle natural-language requests for flight reservations and company travel policy lookups
  - Zero-shot Agent: Perform all of the steps at once without confirmation by the user
  - Confirmation Agent: Confirm with the user before every tool call
  - Smart Confirm Agent: Only confirm with the user before writing to the database
- Audio Agent: Use the OpenAI Agents library to process voice data with multi-agent pipelines.
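The difference between the Zero-shot, Confirmation, and Smart Confirm travel agents listed above comes down to when the user is asked before a tool runs. The sketch below shows that decision in isolation. It assumes hypothetical tool names (`book_flight`, `cancel_flight`) and policy labels, and it does not use LangGraph; the actual examples may structure this differently.

```python
from typing import Callable, Set

# Hypothetical names for tools that write to the database.
WRITE_TOOLS: Set[str] = {"book_flight", "cancel_flight"}


def needs_confirmation(policy: str, tool_name: str) -> bool:
    """Decide whether to pause for user approval before a tool call."""
    if policy == "zero_shot":
        return False  # run every step without asking
    if policy == "confirm":
        return True  # ask before every tool call
    if policy == "smart_confirm":
        return tool_name in WRITE_TOOLS  # ask only before writes
    raise ValueError(f"Unknown policy: {policy}")


def call_tool(
    policy: str,
    tool_name: str,
    tool: Callable[[], str],
    ask_user: Callable[[str], bool],
) -> str:
    """Run the tool unless the policy requires approval and the user declines."""
    if needs_confirmation(policy, tool_name) and not ask_user(tool_name):
        return f"{tool_name} skipped by user"
    return tool()
```

For example, `call_tool("smart_confirm", "search_flights", tool, ask_user)` runs the tool without calling `ask_user`, because a search is not in the write set.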
Debug AI Agent Execution using the Trace feature
Leverage agents that use a large language model as the brain to direct tools that interact with the real world.
Various simple examples for getting started with different frameworks
- Terminology
- PyTorch
- TensorFlow
Various recipes for common feature engineering tasks.
- Pandas essentials
- Handle missing data
- Convert class labels to numbers
- Imbalanced classification
- Choose Fourier features
- Kaggle predict home price feature prep
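Two of the recipes above, handling missing data and converting class labels to numbers, can be illustrated without any dependencies. This is a sketch of the underlying idea only; the notebooks likely rely on pandas or scikit-learn, and the function names here are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def encode_labels(labels: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct label to an integer, assigned in sorted order."""
    mapping = {label: index for index, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the known values."""
    known = [value for value in values if value is not None]
    if not known:
        raise ValueError("Cannot impute: every value is missing")
    fill = mean(known)
    return [fill if value is None else value for value in values]


if __name__ == "__main__":
    print(encode_labels(["M", "B", "M"]))
    print(impute_mean([1.0, None, 3.0]))
```

Mean imputation is only one option; median or a model-based fill may suit skewed data better.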
Recipes for working with images
Increase the effectiveness of OCR by preprocessing images
Various examples that deal with predicting a value based on inputs
Various examples that deal with placing inputs into one or more categories
- Classify breast cancer diagnosis with PyTorch
- Image classification with PyTorch and Fashion MNIST
- Sentiment analysis in JavaScript using transformers
Various examples that deal with grouping data points by a similarity metric.
- Cluster seed types using K-Means and scikit-learn
- Cluster penguin species using DBSCAN and scikit-learn
Group items that are similar using only their attributes
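To show what grouping by similarity means in practice, here is a small plain-Python K-Means sketch. It is a stand-in for illustration, not the scikit-learn code used in the clustering examples, and it makes a simplifying assumption: the first `k` points seed the centroids instead of a randomized or k-means++ initialization.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Return a cluster index for each point."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: deterministic seeding with the first k points.
    centroids: List[Point] = list(points[:k])
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (10.0, 10.0), (10.2, 9.9), (9.8, 10.1)]
    print(k_means(data, k=2))
```

K-Means assumes roughly compact, similarly sized clusters; density-based methods such as DBSCAN make different assumptions and do not need `k` up front.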
Various examples that deal with time based data
Predict future values in data that varies over time
Various examples that deal with processing image data.
- Image classification with PyTorch and Fashion MNIST
- Image segmentation using the Meta Segment Anything Model and OpenCV
Intelligently select complex objects in images
Examples that interact with large language models: models with billions of parameters that are often trained across many commercial-grade GPUs for millions of GPU hours.
Direct a large language model to answer based only on context from documents
Form graphs to model decisions and loops with AI
- Call ChatGPT with a prompt template
- Use a Hugging Face inference API
- Use a query engine to access data in documents
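Answering only from document context usually means two steps: pick the most relevant passages, then place them in the prompt with an instruction to stay within them. The sketch below shows that shape under loud assumptions: retrieval is naive word overlap instead of embeddings, no model is called, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
from typing import List


def retrieve(question: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many lowercase words they share with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = ["The refund window is 30 days.", "Flights depart from gate 12."]
    print(build_prompt("What is the refund window?", docs))
```

A real pipeline would replace the word-overlap ranking with a vector store or query engine and send the prompt to a model.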
Various tasks that deal with deploying AI systems
- Kaggle predict home prices, batch evaluate blind test data
- Run Llama 3 on an AWS EC2 instance
- Run LLM pipeline Python code in a container using Docker
- Serve index and RAG pipelines over HTTP using Flask
- Use the Model Context Protocol (MCP) with a custom LLM client and data access server
- Dotnet MCP Server with VS Code and GitHub Copilot
- Agents in Software Engineering: Survey, Landscape, and Vision
- Attention Is All You Need
- Let's build ChatGPT: from scratch, in code, spelled out, Andrej Karpathy
- Model Context Protocol
- ReAct: Synergizing Reasoning and Acting in Language Models
- SmolLM GitHub
- 12 Factor Agent
Some of the examples in this repo are meant to be run interactively using JupyterLab or Jupyter Notebook. See https://jupyter.org/install
Examples that only have script files will have a README file with instructions.
To avoid conflicts with your local environment, create a virtual environment and run the notebook within this environment.
Then launch JupyterLab with the command jupyter lab and select the virtualenv kernel.
For additional background see https://www.linkedin.com/pulse/how-use-virtual-environment-inside-jupyter-lab-sina-khoshgoftar
python -m venv .venv
.venv\Scripts\activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
python -m ipykernel install --user --name=virtualenv
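After activating the environment or selecting the kernel, it can help to confirm that Python is really running inside the virtual environment. A minimal check, which can be pasted into a notebook cell or run as a script, compares `sys.prefix` with `sys.base_prefix`; the function name is only illustrative.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

If this reports False inside Jupyter, the notebook is probably using the default kernel and not the one registered with ipykernel.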
Most advances in machine learning are happening on Linux targeting Nvidia GPUs with CUDA support. Some advanced models such as Llama 3 may not work well (or at all) on Windows machines.
Some examples target Node.js. No specific version of Node is required, but you can use nvm
to keep your environments tidy. See https://github.com/nvm-sh/nvm for more details.
Workbook examples that include LLMs are more complex than other examples and require additional setup work.
Llama 3
- Download the model weights (requires an access request that is granted by Meta staff, may take 24 hours or more to be approved) https://huggingface.co/meta-llama/Meta-Llama-3-8B
- Model weights are gigabytes of data; store them on a drive with sufficient space
- Clone the model code https://github.com/meta-llama/llama3
- Change to the directory with the model code and pip install the model and dependencies
pip install -e .
Examples in this repo cover the following industry domain problems:
- Accounting
  - Receipt processing
- Botany
  - Group observations into n groups based on equal variance
- Customer Service
  - Context aware chat bots
- Event Planning
  - Generate music playlists
  - Generate menus for specific occasions
  - Calculate the total time needed for setup
- Fitness
  - Generate a strength training program
- Games
  - AI controlled NPCs
- Hospitality
  - Sentiment analysis
- Medical
  - Breast cancer diagnosis
- Real Estate
  - Price prediction
- Retail
  - Product image classification
- Technology
  - Deploy machine learning models to production
  - Compose workflows involving large language models (LLMs)
  - Store and search for embedding data in vector stores
  - Intelligently select complex objects in images
  - Expand capabilities of large language models with custom tool calling
  - Create AI agents that can interact with the real world
- Transportation
  - Seasonal airline traffic prediction
  - Search the internet for locations, calculate travel times to all destinations
  - Manage flights
- Zoology
  - Group observations based on data density