devanand343/content-machine-2026

🤖 Full Content Machine 2026

An AI-powered, agentic content creation pipeline that transforms topic lists into fully researched, SEO-optimized, human-readable articles, with Human-in-the-Loop (HITL) approval built in. It generates a 1,500-word SEO article in approximately 90 seconds, depending on LLM latency.


📋 Table of Contents

  • Overview
  • Key Features
  • Architecture
  • Project Structure
  • Tech Stack
  • Prerequisites
  • Installation
  • Configuration
  • Running the App
  • How to Use
  • Pipeline Flow
  • Contributing
  • License

Overview

Full Content Machine 2026 is a fully automated, multi-agent RAG (Retrieval-Augmented Generation) system built with LangGraph, LangChain, ChromaDB, and Streamlit. You upload a simple CSV/Excel content calendar, and the machine produces polished, SEO-optimized articles, one at a time, with a human review checkpoint at the outline stage.

It's designed for content teams, SEO agencies, and solo creators who want to scale output without sacrificing quality.


Key Features

  • 📄 CSV/Excel Content Calendar: bulk import topics & keywords
  • 🧠 Multi-Agent RAG Pipeline: separate agents for query generation, retrieval, outlining, drafting, and editing
  • ✅ Human-in-the-Loop (HITL): review and approve or regenerate outlines before drafting begins
  • 📚 Knowledge Base Ingestion: upload PDF/TXT files into ChromaDB collections for contextual research
  • ✍️ AI Writing & Editing: GPT-powered drafting plus an editorial refinement pass
  • 📦 HTML Export: articles exported as clean, publish-ready HTML files
  • 🔄 LangGraph Orchestration: stateful, inspectable graph-based agent workflow
  • 🎛️ Streamlit UI: clean, interactive web interface; no code required to run

Architecture

Content Calendar (CSV/Excel)
        │
        ▼
┌───────────────────┐
│  Query Generator  │  → Generates search queries from topic + keyword
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│     Retriever     │  → Fetches context from ChromaDB vector store
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│ Outline Generator │  → Creates structured article outline (GPT-4o)
└─────────┬─────────┘
          │
     [HITL Review]   ← Human approves or rejects outline
          │
          ▼
┌───────────────────┐
│  Article Writer   │  → Drafts the full article (async, GPT-4o)
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│  Article Editor   │  → Refines tone, clarity, SEO
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│   HTML Exporter   │  → Saves final article to /exports
└───────────────────┘
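The real workflow lives in src/graph.py as a LangGraph state machine; as a rough plain-Python illustration of the staged flow with a HITL pause (all function and field names below are hypothetical, not taken from the codebase):

```python
from dataclasses import dataclass, field

@dataclass
class ArticleState:
    """Minimal pipeline state, loosely mirroring a LangGraph state object."""
    topic: str
    keyword: str
    queries: list = field(default_factory=list)
    outline: str = ""
    outline_approved: bool = False  # flipped by the human reviewer
    draft: str = ""

def generate_queries(state):
    # The real app asks an LLM to expand the topic; here we just combine fields.
    state.queries = [f"{state.topic} {state.keyword}", f"what is {state.keyword}"]
    return state

def make_outline(state):
    state.outline = f"1. Intro to {state.topic}\n2. Why {state.keyword} matters\n3. Conclusion"
    return state

def run_until_review(state):
    """Run the pre-review stages, then stop so a human can approve the outline."""
    for stage in (generate_queries, make_outline):
        state = stage(state)
    return state  # caller inspects state.outline and sets outline_approved

state = run_until_review(ArticleState("How to Start a Podcast", "start a podcast"))
print(state.outline)
```

In the actual app, the pause is implemented as a graph checkpoint rather than a function return, which is what lets the Streamlit UI resume the run after approval.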

Project Structure

Content Machine/
├── app.py                  # Main Streamlit application entry point
├── requirements.txt        # Python dependencies (pinned versions)
├── .env                    # API keys and secrets (NOT committed to git)
├── .gitignore              # Files excluded from version control
├── README.md               # This file
│
├── src/                    # Core application modules
│   ├── __init__.py
│   ├── config.py           # App-wide configuration & ChromaDB collection names
│   ├── db.py               # ChromaDB initialization & collection management
│   ├── ingestion.py        # PDF/TXT ingestion into vector store
│   ├── query_generator.py  # Converts CSV rows into search queries
│   ├── retriever.py        # RAG retrieval from ChromaDB
│   ├── graph.py            # LangGraph pipeline definition & state management
│   ├── writer.py           # AI article drafting agent
│   ├── editor.py           # AI article editing/refinement agent
│   └── exporter.py         # HTML export utilities
│
├── chroma_db/              # Local ChromaDB vector store (auto-generated, git-ignored)
└── exports/                # Generated HTML articles (git-ignored)
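For orientation, src/config.py plausibly centralizes settings along these lines (every name and value below is an illustrative assumption, not copied from the repo, except the model names from the Tech Stack table):

```python
# Hypothetical sketch of src/config.py-style settings.
import os

CHROMA_PATH = "chroma_db"          # local persistent vector store directory
KB_COLLECTION = "knowledge_base"   # ChromaDB collection name (assumed)
EXPORT_DIR = "exports"             # where finished HTML articles land
LLM_MODEL = "gpt-4o"               # per the Tech Stack table
EMBEDDING_MODEL = "text-embedding-ada-002"

def openai_api_key():
    """Read the API key from the environment (populated from .env)."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; see Configuration")
    return key
```

Keeping collection names and paths in one module means the ingestion, retrieval, and export code all agree on where data lives.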

Tech Stack

Layer           | Technology
--------------- | ------------------------------
UI              | Streamlit
Orchestration   | LangGraph
LLM Framework   | LangChain
LLM Provider    | OpenAI GPT-4o
Vector Database | ChromaDB
Embeddings      | OpenAI text-embedding-ada-002
Data Processing | Pandas
PDF Parsing     | pypdf, pypdfium2
Async Runtime   | asyncio + nest_asyncio
Language        | Python 3.11+

Prerequisites

  • Python 3.11 or higher
  • An OpenAI API key (available from platform.openai.com)
  • pip or a virtual environment manager (venv, conda, etc.)
  • Git (for version control)

Installation

1. Clone the Repository

git clone https://github.com/YOUR_USERNAME/content-machine-2026.git
cd content-machine-2026

2. Create a Virtual Environment

python -m venv .myenv
source .myenv/bin/activate        # macOS / Linux
# .myenv\Scripts\activate         # Windows

3. Install Dependencies

pip install -r requirements.txt

Configuration

Create a .env file in the project root (this is git-ignored for security):

cp .env.example .env   # if example exists, otherwise create manually

Add your secrets to .env:

OPENAI_API_KEY=sk-...your-key-here...
# Add any other API keys or config values here

⚠️ Never commit your .env file. It is already listed in .gitignore.
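The app presumably loads these values at startup (most Python projects use python-dotenv's load_dotenv() for this); as a minimal stdlib-only sketch of what that loading amounts to:

```python
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    A minimal sketch; the real app likely uses python-dotenv instead.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Usage (assumes a .env file exists in the current directory):
# load_env()
# api_key = os.environ["OPENAI_API_KEY"]
```

Using setdefault means values already exported in your shell take precedence over the file.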


Running the App

# Make sure your virtual environment is active
source .myenv/bin/activate

# Launch the Streamlit app
streamlit run app.py

The app will open at http://localhost:8501 in your browser.


How to Use

Step 1 (Optional): Ingest a Knowledge Base

Use the sidebar to upload PDF or TXT files into a ChromaDB collection. This gives the AI contextual research material to draw from when writing articles.
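Under the hood, ingestion has to split each document into chunks before embedding them into ChromaDB. The chunk size and overlap below are illustrative assumptions, not the project's actual settings:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows for embedding.

    Sizes are illustrative; the project's ingestion.py may differ.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = ("word " * 300).strip()   # stand-in for text extracted from a PDF
pieces = chunk_text(doc)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.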

Step 2: Upload Your Content Calendar

Upload a CSV or Excel file with at least two columns:

  • Topic: the article title or subject
  • Keyword: the primary SEO keyword to target

Example CSV:

Topic,Keyword,Search Intent
How to Start a Podcast,start a podcast,Informational
Best Running Shoes 2026,best running shoes,Commercial
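A small stdlib sketch of the kind of validation the app has to do on upload (column names assumed to match the example above):

```python
import csv
import io

REQUIRED = {"Topic", "Keyword"}

def load_calendar(csv_text):
    """Parse a content calendar and check that the required columns exist."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("calendar is empty")
    missing = REQUIRED - set(rows[0].keys())
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return rows

calendar = """Topic,Keyword,Search Intent
How to Start a Podcast,start a podcast,Informational
Best Running Shoes 2026,best running shoes,Commercial
"""
rows = load_calendar(calendar)
```

Extra columns like Search Intent pass through untouched; only Topic and Keyword are mandatory here.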

Step 3: Map Columns

Select which columns correspond to Topic and Keyword using the dropdowns.

Step 4: Run the Machine

Click 🚀 Run Content Machine. For each topic, the pipeline will:

  1. Generate research queries
  2. Retrieve relevant context from the vector store
  3. Generate an article outline, then pause for your review
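The query fan-out in step 1 can be pictured as one calendar row expanding into several retrieval queries. The templates below are hypothetical stand-ins for the LLM-generated queries produced by src/query_generator.py:

```python
def research_queries(topic, keyword):
    """Expand one calendar row into multiple retrieval queries.

    The real pipeline asks an LLM for these; the templates here
    are only illustrative.
    """
    return [
        topic,
        f"{keyword} best practices",
        f"common mistakes with {keyword}",
        f"{keyword} statistics and trends",
    ]

qs = research_queries("How to Start a Podcast", "start a podcast")
```

Each query is then run against the ChromaDB knowledge base, and the retrieved passages feed the outline generator.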

Step 5: Review the Outline (HITL)

  • ✅ Approve: continue to drafting, editing, and export
  • ❌ Reject & Regenerate: add feedback and get a revised outline

Step 6: Collect Exported Articles

Finished articles are saved as .html files in the /exports folder.
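Export itself is straightforward; a stdlib sketch of what src/exporter.py plausibly does (the filename scheme and HTML skeleton are assumptions):

```python
from pathlib import Path
import re

def export_html(title, body_html, out_dir="exports"):
    """Wrap an article body in a minimal HTML page and save it to out_dir."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    page = (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<meta charset='utf-8'>\n<title>{title}</title>\n"
        "</head>\n<body>\n"
        f"<h1>{title}</h1>\n{body_html}\n"
        "</body>\n</html>\n"
    )
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    path = out / f"{slug}.html"
    path.write_text(page, encoding="utf-8")
    return path

p = export_html("How to Start a Podcast", "<p>Draft body goes here.</p>")
```

Slugged filenames keep exports predictable when you run a whole calendar in one batch.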


Pipeline Flow

Upload Calendar → Query Gen → RAG Retrieval → Outline
                                                  ↓
                                          [Human Review]
                                         ↙           ↘
                                    Approve         Reject + Feedback
                                       ↓                  ↓
                                  Draft Article     Regenerate Outline
                                       ↓
                                  Edit Article
                                       ↓
                                  Export HTML
                                       ↓
                               Next Topic in Queue

Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repo
  2. Create your feature branch: git checkout -b feature/your-feature
  3. Commit your changes: git commit -m 'Add some feature'
  4. Push to the branch: git push origin feature/your-feature
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.


Built with ❤️ using LangGraph, LangChain, OpenAI, ChromaDB & Streamlit.
