## Repository-to-Tutorial Automation Project

Imagine turning any code repository into a polished tutorial in a matter of minutes. This project uses Google Gemini to automate the transformation of software repositories into detailed, user-friendly guides.

By analyzing code and file structures, it generates tutorials that help developers and users get started, for example as onboarding material or a quick-start document.

Effortless documentation, endless possibilities.

### Introduction to the Gemini Long Context API

The Gemini Long Context API is designed to handle extensive and complex inputs, so developers and businesses can work effectively with large volumes of content. It can process and retain meaning across long-form text, which makes it suited for multi-file analysis and document-heavy tasks, and it can generate insights tailored to the provided content.

#### What does this project do?

This project showcases the API's capability to transform a repository of code and documentation into a comprehensive, human-readable tutorial, emphasizing Gemini's utility in bridging technical content and user understanding. Its ability to process multiple files, summarize content, and respond accurately highlights the transformative potential of the Gemini Long Context API in real-world applications.  

## Let's Start 🚀

#### Import our Python Packages

In [8]:
import os
import subprocess
import google.generativeai as genai
import time
from fpdf import FPDF
import asyncio

#### Authenticate with Gemini API and clone the code repo

In [2]:
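# Read the key from the GEMINI_API_KEY environment variable instead of hardcoding it.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])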

def clone_repo(repo_url, local_path):
    """Clones a GitHub repository to a local directory."""
    if not os.path.exists(local_path):
        subprocess.run(["git", "clone", repo_url, local_path], check=True)
        print(f"Cloned repository to {local_path}")
    else:
        print(f"Repository already cloned at {local_path}")
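
Because `clone_repo` skips cloning whenever `local_path` already exists, reusing one path for two different repositories means the second call keeps working on the first repository. Here is a minimal sketch of one way around this, assuming a hypothetical helper `local_path_for` (not part of the original notebook) that derives a separate folder name from each URL:

```python
import os
from urllib.parse import urlparse


def local_path_for(repo_url, base_dir="./repos"):
    """Hypothetical helper: map a repository URL to its own local folder."""
    # Take the last path segment of the URL, e.g. "generative-ai-python.git".
    name = urlparse(repo_url).path.rstrip("/").split("/")[-1]
    # Drop a trailing ".git" so both URL styles map to the same folder.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return os.path.join(base_dir, name)


print(local_path_for("https://github.com/google-gemini/generative-ai-python.git"))
print(local_path_for("https://github.com/google-gemini/gemini-api-quickstart"))
```

The returned path can then be passed as `local_path` to `clone_repo`.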

#### Find and Filter Files

- Once the repository is cloned, the script scans all of its files and picks out the ones we care about, such as .js, .jsx, .py, .md, or .json.

In [9]:
def get_file_list(repo_path):
    """Retrieves a list of all files in the repository directory."""
    file_list = []
    for root, _, files in os.walk(repo_path):
        for file in files:
            file_list.append(os.path.join(root, file))
    return file_list

def filter_code_files(file_list, extensions=(".js", ".json", ".md", ".jsx", ".ts", ".py", ".txt", ".rst", ".yml", ".yaml")):
    """Filters the file list for specific code file extensions."""
    return [file for file in file_list if any(file.endswith(ext) for ext in extensions)]
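
`os.walk` also descends into folders such as `node_modules` or `.git`, whose contents are rarely worth sending to the model. Below is a sketch of pruning them during the walk. The `SKIP_DIRS` set and the function name `get_file_list_pruned` are assumptions for illustration, not part of the original notebook:

```python
import os

# Assumed set of folders to skip; adjust for the repository at hand.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def get_file_list_pruned(repo_path, skip_dirs=SKIP_DIRS):
    """Like get_file_list, but never descends into the skipped folders."""
    file_list = []
    for root, dirs, files in os.walk(repo_path):
        # Editing dirs in place tells os.walk not to visit those folders.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for file in files:
            file_list.append(os.path.join(root, file))
    return file_list
```

The result can be fed to `filter_code_files` exactly like the output of `get_file_list`.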

### Handling rate limits, processing files for tutorial generation
While I was working on this, I got a resource exhaustion error, so I had to add this code to handle it.

We can also see two functions here:

- `process_files_with_rate_limit()` prepares the files for tutorial generation by capping the number of files and the characters read from each one.
- `generate_tutorial_with_rate_limit()` generates the tutorial from the processed file data, retrying when an API call fails.

Toward the end of the code, you will also notice the prompt, which is the instruction I have given to Google Gemini to generate the desired output: the tutorial. 🥳

```
generation_config = {
        'temperature': 0.3,
        'max_output_tokens': 16000
    }
```

- temperature: Controls the randomness of the output. A low value like 0.3 makes the model more focused, which is just what I need for my tutorials, while a value like 0.7 or 0.9 makes it more creative.
- max_output_tokens: Specifies the maximum number of tokens (words or word fragments) the response can contain. 16000 asks for a lengthy, detailed tutorial, although the model may enforce a lower cap of its own and cut the response off earlier.
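
To get a feel for how much text one request carries, here is a rough back-of-the-envelope sketch. It uses the defaults of `process_files_with_rate_limit` (10 files, 5000 characters each) and assumes about 4 characters per token, which is only a rule of thumb. The helper name `estimate_prompt_tokens` is hypothetical, and the estimate ignores the instruction text and the per-file headers.

```python
def estimate_prompt_tokens(max_files=10, max_chars_per_file=5000, chars_per_token=4):
    """Rough upper bound on the tokens contributed by the file contents.

    chars_per_token=4 is an assumed rule of thumb, not an exact tokenizer count.
    """
    total_chars = max_files * max_chars_per_file
    return total_chars // chars_per_token


print(estimate_prompt_tokens())  # 10 * 5000 // 4 = 12500
```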

In [10]:
# Cap how much file content is sent to the model
def process_files_with_rate_limit(files, max_files=10, max_chars_per_file=5000):
    """Reads at most max_files files, truncating each to max_chars_per_file characters."""
    processed_files = []
    
    for file in files[:max_files]:
        try:
            with open(file, 'r', encoding='utf-8', errors='ignore') as f:
                content = f.read(max_chars_per_file)
                
                file_entry = f"### File: {os.path.basename(file)}\n```\n{content}\n```\n"
                processed_files.append(file_entry)
                
        except Exception as e:
            print(f"Error processing {file}: {e}")
    
    return processed_files

def generate_tutorial_with_rate_limit(processed_files, max_retries=3):
    model = genai.GenerativeModel('gemini-pro')
    
    full_content = "# Repository Tutorial\n\n" + "\n".join(processed_files)
    
    generation_config = {
        'temperature': 0.3,
        'max_output_tokens': 16000
    }
    
    for attempt in range(max_retries):
        try:
            prompt = f"""
            You are an expert software developer creating a comprehensive tutorial.
            Generate a detailed tutorial based on the following repository contents.
            Focus on explaining the project structure, key components, and how to set up and use the project.

            Repository Contents:
            {full_content}

            Tutorial Requirements:
            - Provide a clear project overview
            - Explain how to set up the project
            - Describe key files and their purposes
            - Give step-by-step instructions
            - Include any necessary configuration or dependencies
            """
            
            response = model.generate_content(
                prompt, 
                generation_config=generation_config
            )
            
            return response.text
        
        except Exception as e:
            print(f"API call failed (Attempt {attempt + 1}/{max_retries}): {e}")
            
            # Back off exponentially, but skip the wait after the final attempt.
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
    
    return "Unable to generate tutorial due to persistent API issues."
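
The retry loop above waits `2 ** attempt` seconds between attempts. The sketch below isolates that backoff pattern in a generic wrapper so the schedule is easy to see and test. `retry_with_backoff` and its `sleep` parameter are illustrative names, not part of the Gemini SDK.

```python
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_retries attempts have failed.

    Waits base_delay * 2**attempt seconds between attempts (1s, 2s, ...),
    and re-raises the last error instead of sleeping after the final attempt.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as error:
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1}/{max_retries} failed: {error}")
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter is a design choice that lets the schedule be checked without actually waiting.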

#### Generating the PDF 

Here we make use of all the functions and save the AI-generated tutorial as a shareable PDF.

In [11]:
def save_tutorial_as_pdf(tutorial_text, output_file="repository_tutorial.pdf"):
    try:
        pdf = FPDF()
        pdf.add_page()
        pdf.set_font("Arial", size=12)
        
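        # The built-in Arial font only supports Latin-1, so replace anything else.
        safe_text = tutorial_text.encode("latin-1", "replace").decode("latin-1")
        
        # Write line by line so that no word is split at an arbitrary chunk boundary.
        for line in safe_text.splitlines():
            pdf.multi_cell(0, 10, line)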
        
        pdf.output(output_file)
        print(f"Tutorial saved as {output_file}")
    except Exception as e:
        print(f"Error creating PDF: {e}")

def generate_repository_tutorial(repo_url, local_path="./repo"):
    clone_repo(repo_url, local_path)
    
    all_files = get_file_list(local_path)
    code_files = filter_code_files(all_files)
    
    print(f"Found {len(code_files)} potential code files. Processing...")
    
    processed_files = process_files_with_rate_limit(code_files)
    
    tutorial = generate_tutorial_with_rate_limit(processed_files)
    
    save_tutorial_as_pdf(tutorial)
    
    return tutorial

#### Let's give it a try! 

- Give it a Repository URL

In [12]:
tutorial = generate_repository_tutorial("https://github.com/google-gemini/generative-ai-python.git")

print(tutorial)

save_tutorial_as_pdf(tutorial, output_file="Python_SDK_tutorial.pdf")

Repository already cloned at ./repo
Found 403 potential code files. Processing...
Tutorial saved as repository_tutorial.pdf
## Comprehensive Tutorial for the Generative AI Python SDK

### Project Overview

The Google AI Python SDK for the Gemini API enables Python developers to seamlessly interact with Gemini models, a suite of multimodal models developed by Google DeepMind. These models empower developers to reason across text, images, and code, unlocking a wide range of applications.

### Setting Up the Project

**1. Installation**

Install the SDK using pip:

```
pip install -U google-generativeai
```

**2. API Key**

Obtain an API key from Google AI Studio:

- Go to [Google AI Studio](https://aistudio.google.com/).
- Log in with your Google account.
- Create an API key.

**3. Configuration**

Configure the SDK with your API key:

```python
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
```

### Key Files and Their Purposes

- **

In [13]:
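# Note: the default local_path ("./repo") already holds the first repository, so
# clone_repo() skips cloning and this run analyzes the Python SDK again, as the
# output below shows. Pass a distinct local_path (for example
# local_path="./quickstart_repo") to analyze this repository instead.
tutorial = generate_repository_tutorial("https://github.com/google-gemini/gemini-api-quickstart")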

print(tutorial)

save_tutorial_as_pdf(tutorial, output_file="Gemini_API_Quickstart_tutorial.pdf")

Repository already cloned at ./repo
Found 403 potential code files. Processing...
Tutorial saved as repository_tutorial.pdf
## Comprehensive Tutorial for the Google AI Python SDK for the Gemini API

### Project Overview

The Google AI Python SDK for the Gemini API provides a high-level interface for developers to interact with the Gemini API. Gemini is a multimodal AI model developed by Google DeepMind that can reason across text, images, and code.

### Setup

**1. Install the SDK**

```
pip install google-generativeai
```

**2. Configure API Key**

Set the `GEMINI_API_KEY` environment variable with your API key. You can obtain an API key from the [Google AI Studio](https://aistudio.google.com/).

### Key Files

**1. README.md**

Provides an overview of the SDK, usage examples, and documentation links.

**2. CONTRIBUTING.md**

Outlines the process for contributing to the SDK, including signing a Contributor License Agreement (CLA).

**3. RELEASE.md**

Lists the changes and new features

#### Incomplete article? Let's solve that

We will generate the tutorial in parts, processing smaller sections of the file list individually and then combining the results into the complete article.

In [14]:
def generate_repository_tutorial(repo_url, local_path="./repo", chunk_size=5):
    clone_repo(repo_url, local_path)
    
    all_files = get_file_list(local_path)
    code_files = filter_code_files(all_files)
    
    print(f"Found {len(code_files)} potential code files. Processing...")
    
    chunks = [code_files[i:i + chunk_size] for i in range(0, len(code_files), chunk_size)]
    
    full_tutorial = []
    for i, chunk in enumerate(chunks):
        print(f"Processing chunk {i + 1}/{len(chunks)}...")
        
        processed_files = process_files_with_rate_limit(chunk)
        
        tutorial_part = generate_tutorial_with_rate_limit(processed_files)
        full_tutorial.append(tutorial_part)
    
    combined_tutorial = "\n\n".join(full_tutorial)
    
    save_tutorial_as_pdf(combined_tutorial)
    
    return combined_tutorial
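
The slicing expression used for `chunks` can be tried out on its own. Here is a small sketch, with `split_into_chunks` as a hypothetical helper name and made-up file names, showing why 403 files with `chunk_size=5` produce 81 chunks:

```python
def split_into_chunks(items, chunk_size=5):
    """Split a list into consecutive slices of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


files = [f"file_{n}.py" for n in range(403)]  # stand-in for the real file list
chunks = split_into_chunks(files, chunk_size=5)

print(len(chunks))      # 80 full chunks plus one partial chunk = 81
print(len(chunks[-1]))  # the last chunk holds the remaining 3 files
```

Each chunk triggers one call to `generate_tutorial_with_rate_limit`, so 81 chunks mean 81 API requests.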


In [15]:
tutorial = generate_repository_tutorial("https://github.com/google-gemini/generative-ai-python.git", chunk_size=5)

print(tutorial)

save_tutorial_as_pdf(tutorial, output_file="New_modified_tutorial.pdf")

Repository already cloned at ./repo
Found 403 potential code files. Processing...
Processing chunk 1/81...
Processing chunk 2/81...
Processing chunk 3/81...
Processing chunk 4/81...
Processing chunk 5/81...
Processing chunk 6/81...
Processing chunk 7/81...
Processing chunk 8/81...
Processing chunk 9/81...
Processing chunk 10/81...
Processing chunk 11/81...
Processing chunk 12/81...
Processing chunk 13/81...
Processing chunk 14/81...
Processing chunk 15/81...
Processing chunk 16/81...
Processing chunk 17/81...
Processing chunk 18/81...
Processing chunk 19/81...
Processing chunk 20/81...
Processing chunk 21/81...
Processing chunk 22/81...
Processing chunk 23/81...
Processing chunk 24/81...
Processing chunk 25/81...
Processing chunk 26/81...
Processing chunk 27/81...
Processing chunk 28/81...
Processing chunk 29/81...
Processing chunk 30/81...
Processing chunk 31/81...
Processing chunk 32/81...
Processing chunk 33/81...
Processing chunk 34/81...
Processing chunk 35/81...
Processing chunk 