# Chat with local code (part 1: chunking and embedding)

![rag-codigo-local.jpg](../rag-codigo-local.jpg)

## 1. Scanning code folders and files

This example considers only Terraform files (`.tf`, `.tfvars`) and related files such as `.sh` scripts or `.md` notes.


In [None]:
# Some requirements
# !pip install chromadb
# !pip install langchain
# !pip install langchain_community
# !pip install langchain-text-splitters
# !pip install sentence-transformers  # used by HuggingFaceEmbeddings
# !pip install openai

In [None]:
import os

def get_first_summary_from_contents(content,
                                    summary_strings=('# summary:', '# resumen:')):
    """Extracts the first summary comment found in `content`.

    Example line:  # Summary: Creates a simple Application Load Balancer and an ASG
    Returns:       Creates a simple Application Load Balancer and an ASG
    Returns '' when no line starts with one of `summary_strings`.
    """
    for line in content.split('\n'):
        stripped = line.strip()
        lowered = stripped.lower()
        for prefix in summary_strings:
            if lowered.startswith(prefix):
                # drop the matching comment prefix and keep the original casing
                return stripped[len(prefix):].strip()
    # give up only after every line has been checked
    return ''


def load_snippets(base_path, extensions=['.tf', '.tfvars', '.sh', '.md']):
    snippets = []
    for root, _, files in os.walk(base_path):
        for file in files:
            if any([file.endswith(ext) for ext in extensions]):
                full_path = os.path.join(root, file)
                with open(full_path, "r", encoding="utf-8") as f:
                    content = f.read()
                    file_summary = get_first_summary_from_contents(content)
                    snippets.append({
                        "content": content,
                        "metadata": {
                            "filename": file,
                            "filepath": full_path,
                            "filesize": len(content),
                            "project": os.path.basename(root),
                            'summary': file_summary
                        }
                    })
    return snippets


CODE_PATH = '../data/terraform-examples/aws/'


snippets = load_snippets(CODE_PATH)
snippets
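
To make the summary convention concrete, here is a minimal, self-contained sketch of how a `# Summary:` comment becomes metadata. The Terraform text and the helper name `first_summary` are made up for illustration; the helper follows the intent of `get_first_summary_from_contents` (scan every line, return the first match) and is not a drop-in replacement for it.

```python
# Hypothetical Terraform content, used only for this illustration.
SAMPLE_TF = """\
terraform {
  required_version = ">= 1.0.0"
}
# Summary: Creates a simple Application Load Balancer and an ASG
resource "aws_lb" "example" {}
"""

def first_summary(content, prefixes=("# summary:", "# resumen:")):
    """Return the text after the first summary-style comment, or ''."""
    for line in content.splitlines():
        stripped = line.strip()
        for prefix in prefixes:
            if stripped.lower().startswith(prefix):
                return stripped[len(prefix):].strip()
    return ""

summary = first_summary(SAMPLE_TF)
no_summary = first_summary('resource "aws_s3_bucket" "b" {}')
print(repr(summary))
print(repr(no_summary))
```

Files without a summary comment simply get an empty string, so the metadata field is always present.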

## 2. Creating code chunks with a hybrid strategy

A text splitter that knows where Terraform blocks begin can divide the source files into pieces, or *chunks*. Splitting first at block keywords (`resource`, `module`, `variable`, `output`) and only then at plain line breaks helps each chunk keep its semantic meaning.


In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=550,
    chunk_overlap=50,
    separators=["\nresource", "\nmodule", "\nvariable", "\noutput", "\n"]
)

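def split_snippets_into_chunks(snippets):
    """Splits every snippet into chunks, each carrying a copy of the file metadata."""
    chunked_docs = []
    for snippet in snippets:
        chunks = splitter.split_text(snippet["content"])
        for chunk in chunks:
            # getting some metadata for each chunk
            metadata = snippet["metadata"].copy()
            # specific prefixes go first so '# resumen:' is not swallowed by '# '
            chunk_summary = get_first_summary_from_contents(
                chunk,
                ['# summary:', '# resumen:', '# ', '// '])
            # keep the file-level summary untouched and store the chunk-level one next to it
            metadata["chunk_summary"] = f"{metadata['summary']} > {chunk_summary}"

            chunked_docs.append({
                "content": chunk,
                "metadata": metadata
            })
    return chunked_docs

# the function has its own name, so re-running this cell does not overwrite it
chunk_snippets = split_snippets_into_chunks(snippets)
chunk_snippets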
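
The idea of cutting at block keywords can be sketched with the standard library alone. This is a simplified stand-in, not how `RecursiveCharacterTextSplitter` works internally (the real splitter also enforces `chunk_size` and `chunk_overlap` and falls back to later separators). The sample Terraform text and the helper `split_on_blocks` are hypothetical.

```python
import re

# Hypothetical Terraform content, used only for this illustration.
SAMPLE_TF = '''variable "region" {
  default = "us-east-1"
}

resource "aws_s3_bucket" "site" {
  bucket = "my-site"
}

output "bucket_name" {
  value = aws_s3_bucket.site.bucket
}
'''

# Zero-width split point in front of every line that starts a top-level block.
BLOCK_START = re.compile(r"(?m)^(?=(?:resource|module|variable|output)\b)")

def split_on_blocks(text):
    """Split Terraform text so that each piece starts at a block keyword."""
    pieces = BLOCK_START.split(text)
    return [piece.strip() for piece in pieces if piece.strip()]

blocks = split_on_blocks(SAMPLE_TF)
for block in blocks:
    # print only the first line of each piece
    print(block.splitlines()[0])
```

Each piece is a complete block, which is the property the separator list above tries to preserve whenever a block fits within the chunk size.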

## 3. Creating embeddings

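Embedding models such as **OpenAIEmbeddings** or **HuggingFaceEmbeddings** turn each chunk into a vector, so that chunks can later be retrieved by semantic similarity to a query. This example uses **HuggingFaceEmbeddings** through LangChain and stores the vectors in Chroma.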

In [37]:
import os
import shutil
#from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.embeddings import HuggingFaceEmbeddings
#from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document


def create_chroma_vectorstore(docs, persist_directory="/tmp/chatlocal-chroma-db", replace=True):
    """Create the Chroma vector store from the chunked docs"""
    print(persist_directory, replace, os.path.exists(persist_directory))
    if replace and os.path.exists(persist_directory):
        # start from an empty store so old chunks are not mixed with new ones
        print(f'deleting {persist_directory}')
        shutil.rmtree(persist_directory)
    documents = [
        Document(page_content=doc["content"], metadata=doc["metadata"])
        for doc in docs
    ]
    embeddings = HuggingFaceEmbeddings()
    vectorstore = Chroma.from_documents(
        documents,
        embedding=embeddings,
        persist_directory=persist_directory
    )
    # Save to disk; with chromadb >= 0.4 data is persisted automatically and
    # this call is only needed for older versions
    vectorstore.persist()
    return vectorstore

vectorstore = create_chroma_vectorstore(chunk_snippets)
vectorstore

/tmp/chatlocal-chroma-db True False




<langchain_community.vectorstores.chroma.Chroma at 0x73ee883651f0>

In [None]:
!ls ../data/terraform-examples/aws/aws_elb/classic_elb/main.tf
!ls -lht /tmp/chatlocal-chroma-db

In [None]:
#### testing queries against the Vector DB

retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

test_queries = [
    "How do I configure an S3 bucket with versioning?",
    "How do I deploy an observable static site using NGINX?",
]
for query in test_queries:
    print(query)
    # `invoke` is the current retriever entry point; older LangChain versions
    # used `get_relevant_documents`
    docs = retriever.invoke(query)
    for doc in docs:
        print("📄 Content:", doc.page_content)
        print("📎 Metadata:", doc.metadata)
        print("-" * 40)
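
What the retriever does with `k` can be illustrated with toy vectors: every chunk has an embedding, the query is embedded the same way, and the `k` closest chunks are returned. The sketch below uses cosine similarity and hand-written 3-dimensional vectors purely for illustration; the vectors, the names and the helper functions are hypothetical, and the distance metric actually used depends on how the Chroma collection is configured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the names of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in indexed.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical 3-dimensional "embeddings"; real models produce hundreds of dimensions.
toy_index = {
    "s3_bucket/main.tf": [0.9, 0.1, 0.0],
    "classic_elb/main.tf": [0.1, 0.9, 0.2],
    "network_elb/main.tf": [0.0, 0.8, 0.6],
}
query = [0.2, 0.9, 0.1]  # pretend this encodes a load balancer question
nearest = top_k(query, toy_index, k=2)
print(nearest)
```

The S3 vector points in a different direction from the query, so it is left out of the top two.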

## 4. Combining retrieved chunks before building the prompt

A query to the vector DB returns chunks, which are usually fragments of files. To avoid losing semantic context, we try to combine chunks that belong to the same file, or to include the complete file as long as it is not too large.


In [None]:
def group_by_filename(docs, max_total=12096):
    """Returns a dictionary keyed by file path whose values hold either the full
    contents of a file or a single chunk, depending on whether the whole file
    still fits within the `max_total` character budget.
    """
    grouped = {}
    size_used = 0
    doc_number = 0
    for doc in docs:
        filepath = doc.metadata["filepath"]
        filesize = doc.metadata["filesize"]
        chunk_size = len(doc.page_content)
        metadata = {
            'filename': doc.metadata['filename'],
            'summary': doc.metadata['summary'],
            'project': doc.metadata['project']
        }
        print(f"used: {size_used}, doc number: {doc_number}")
        if (filesize + size_used) > max_total:
            if (chunk_size + size_used) > max_total:
                # case where there are too many chunks: this one is skipped
                print(f"Size used: {size_used}, Maybe too big files or too many chunks! This chunk size: {chunk_size}")
            else:
                # the whole file does not fit, but this chunk does; the suffix keeps
                # the key from overwriting a whole-file entry for the same path
                key = f"{filepath}_{doc_number}"
                print(f'Using chunk {filepath} ({chunk_size}) (entire: {filesize})')
                grouped[key] = {'metadata': metadata, 'content': doc.page_content}
                size_used += chunk_size
                doc_number += 1
        else:
            if filepath in grouped:
                # the whole file is already included; do not count it twice
                continue
            # use the whole file contents
            print(f'Using complete file {filepath} ({filesize})')
            with open(filepath, 'r', encoding='utf-8') as f:
                grouped[filepath] = {'metadata': metadata, 'content': f.read()}
            size_used += filesize
    return grouped

combined_docs = group_by_filename(docs)
combined_docs
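
The budgeting rule in `group_by_filename` (whole file if it fits, otherwise only the chunk, otherwise skip) can be traced by hand with small numbers. The following sketch uses a hypothetical helper `plan_budget` and made-up sizes with a budget of 100 characters; it reproduces only the size arithmetic, not the file reading.

```python
def plan_budget(docs, max_total):
    """docs: list of (file_size, chunk_size) pairs. Returns (decisions, size_used)."""
    decisions = []
    size_used = 0
    for file_size, chunk_size in docs:
        if file_size + size_used <= max_total:
            # the complete file still fits in the budget
            decisions.append("whole file")
            size_used += file_size
        elif chunk_size + size_used <= max_total:
            # only the retrieved chunk fits
            decisions.append("chunk only")
            size_used += chunk_size
        else:
            # neither fits, so the document is dropped
            decisions.append("skipped")
    return decisions, size_used

# Made-up sizes: (file_size, chunk_size)
example_docs = [(40, 10), (50, 20), (30, 8), (30, 5)]
decisions, size_used = plan_budget(example_docs, max_total=100)
print(decisions, size_used)
```

The first two files fit whole (40 + 50 = 90), the third only as a chunk (90 + 8 = 98), and the fourth is skipped because even its chunk would exceed the budget (98 + 5 = 103).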

## 5. Building the prompt

With the chunks retrieved from the vector DB, the next step is to build a prompt that instructs the model to give priority to those *chunks* and use them to solve the user's request. If no solution can be found among them, the model should generate its own.

In [42]:
def generate_prompt_from_chunks(user_prompt, docs):
    """Generates a prompt for the given `user_prompt', using the extracted `docs'
    from vector DB.

    Returns: (str) with generated prompt
    """
    docs_formatted = ""
    for filepath, doc in docs.items():
        docs_formatted += f"""
        {filepath}:{doc['metadata']['filename']}
        {doc['metadata']['summary']}
        #-#-#
        {doc['content']}
        ----
        """
    
    prompt = f"""You will be given the following user request:
    {user_prompt}

    To give the solution here are some code snippets that are part of previous solutions
    already tested for the user, not necessarily for this specific request, but they are
    proven solutions. The snippets you will receive are in the format:
    (filepath):(filename)
    (summary_of_snippet)
    #-#-#
    (code_snippet)
    ------
    Here are the snippets:
    {docs_formatted}

    You will give more importance to these snippets to give the solution for the user request,
    when you use the snippets for the solution you will mention the reason you used one or more
    snippets with the format:
    (filepath):(filename)
    (reason)
    ----

    If you consider these snippets do not help to reach the requested solution, generate your
    own solution, mentioning that you did not find enough reasons to use the snippets.
    """
    return prompt
print(generate_prompt_from_chunks('How do I deploy an observable static site using NGINX?', combined_docs))

You will be given the following user request:
    How do I deploy an observable static site using NGINX?

    To give the solution here are some code snippets that are part of previous solutions
    already tested for the user, not necessarily for this specific request, but they are
    proven solutions. The snippets you will receive are in the format:
    (filepath):(filename)
    (summary_of_snippet)
    #-#-#
    (code_snippet)
    ------
    Here are the snippets:
    
        ../data/terraform-examples/aws/aws_elb/classic_elb/main.tf:main.tf
        Creates a simple Classic Elastic Load Balancer with two EC2 Instances
        #-#-#
        # Summary: Creates a simple Classic Elastic Load Balancer with two EC2 Instances 
# Documentation: https://www.terraform.io/docs/language/settings/index.html
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.38"
    }
  }
}

# Documentation: https://www.terraform.io/do
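
In the printed prompt above, the first lines of each snippet are indented while the rest of the file is not, because the indentation of the Python f-string leaks into the text. One possible way to avoid this, shown here as a sketch with a hypothetical helper `format_snippet`, is to join the parts explicitly instead of using an indented triple-quoted string.

```python
def format_snippet(filepath, filename, summary, content):
    """Render one snippet in the (filepath):(filename) / summary / #-#-# / code layout."""
    return "\n".join([
        f"{filepath}:{filename}",
        summary,
        "#-#-#",
        content.rstrip(),
        "----",
    ])

block = format_snippet(
    "aws_elb/classic_elb/main.tf",   # illustrative path
    "main.tf",
    "Creates a simple Classic Elastic Load Balancer with two EC2 Instances",
    'terraform {\n  required_version = ">= 1.0.0"\n}\n',
)
print(block)
```

Every line then starts at column zero, so the Terraform code reaches the model with its original indentation.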

## 6. Sending to the LLM

The constructed prompt is sent to a generative large language model (LLM).


In [43]:
##### USING deepseek-reasoner MODEL FROM API ########
from getpass import getpass

deepseek_api_key = getpass("YOUR DEEPSEEK API:")
print('thanks!')

###### LLM Client #####
from openai import OpenAI
client_deepseek = OpenAI(
    api_key=deepseek_api_key,
    base_url="https://api.deepseek.com")

YOUR DEEPSEEK API: ········


thanks!


In [44]:
def get_llm_api_response(client, prompt, model='deepseek-reasoner', temperature=0.17):
    print(f"Getting response from model {model}. Prompt length: {len(prompt)}")
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a devops engineer building infrastructure using AWS stack"
            },
            {
                "role": "user",
                "content": prompt
            },
        ],
        temperature=temperature,
    )
    return response
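
The model's reply arrives as Markdown text in `response.choices[0].message.content`, typically with the Terraform and shell code inside fenced blocks. As a sketch, assuming the reply uses standard triple-backtick fences, those blocks could be pulled out so that each one can be saved to its own file. The helper `extract_fenced_blocks` and the sample reply are hypothetical.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can itself live inside a fence
BLOCK_RE = re.compile(FENCE + r"(\w*)\n(.*?)\n" + FENCE, re.DOTALL)

def extract_fenced_blocks(markdown_text):
    """Return a list of (language, code) tuples, one per fenced block."""
    return [(lang, code) for lang, code in BLOCK_RE.findall(markdown_text)]

# Hypothetical reply, shaped like a typical answer with two code blocks.
sample_reply = "\n".join([
    "**Solution:**",
    FENCE + "hcl",
    'provider "aws" {',
    '  region = "us-east-1"',
    "}",
    FENCE,
    "Then the bootstrap script:",
    FENCE + "bash",
    "#!/bin/bash",
    "yum update -y",
    FENCE,
])

blocks = extract_fenced_blocks(sample_reply)
for lang, code in blocks:
    print(lang, len(code.splitlines()))
```

The language label comes from the model and may be missing or wrong, so it is safer to treat it as a hint than as a file type.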

In [45]:
## Test complete user interaction

from IPython.display import Markdown

REASONING_MODEL = 'deepseek-reasoner'

initial_prompt = input('Initial Prompt type here: ')

# `invoke` is the current retriever entry point; older LangChain versions
# used `get_relevant_documents`
docs = retriever.invoke(initial_prompt)
combined_docs = group_by_filename(docs)
prompt = generate_prompt_from_chunks(initial_prompt, combined_docs)
print()
response = get_llm_api_response(client_deepseek, prompt, REASONING_MODEL, 0.25)
print("======= RESPONSE ========")
#print(response.choices[0].message.content)

# The response is already Markdown with its own code fences, so it is displayed
# as-is; wrapping it in another fence would break the nested blocks
Markdown(response.choices[0].message.content)

Initial Prompt type here:  How do I deploy a static site (HTML+CSS+JS) with three instances of NGINX behind a load balancer.


used: 0, doc number: 0
Using complete file ../data/terraform-examples/aws/aws_elb/application_elb/main.tf (4079)
used: 4079, doc number: 0
Using complete file ../data/terraform-examples/aws/aws_elb/classic_elb/main.tf (3373)
used: 7452, doc number: 0
Using complete file ../data/terraform-examples/aws/aws_elb/network_elb/main.tf (3795)
used: 11247, doc number: 0
Using chunk ../data/terraform-examples/aws/aws_elb/classic_elb/main.tf (525) (entire: 3373)
used: 11772, doc number: 1
Size used: 11772, Maybe too big files or too many chunks! This chunk size: 419

Getting response from model deepseek-reasoner. Prompt length: 9929


```terraform
Based on the user request and the provided snippets, I'll create a Terraform solution to deploy a static site with three NGINX instances behind an Application Load Balancer. The first snippet (aws_elb/application_elb/main.tf) provides the most relevant foundation for this use case.

**Solution:**

```hcl
# main.tf
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.38"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# Get default VPC and subnets
data "aws_vpc" "default" {
  default = true
}

data "aws_subnet_ids" "all" {
  vpc_id = data.aws_vpc.default.id
}

# Security group for ALB
resource "aws_security_group" "alb_sg" {
  name        = "static-site-alb-sg"
  description = "Allow HTTP traffic to ALB"
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Security group for EC2 instances
resource "aws_security_group" "instance_sg" {
  name        = "static-site-instance-sg"
  description = "Allow HTTP from ALB and SSH for management"
  
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # Restrict this in production
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Target group for ALB
resource "aws_lb_target_group" "static_site" {
  name     = "static-site-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = data.aws_vpc.default.id

  health_check {
    path                = "/"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 3
    interval            = 30
    matcher             = "200-399"
  }
}

# Application Load Balancer
resource "aws_lb" "static_site" {
  name               = "static-site-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = data.aws_subnet_ids.all.ids
}

# ALB listener
resource "aws_lb_listener" "static_site" {
  load_balancer_arn = aws_lb.static_site.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.static_site.arn
  }
}

# Get latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

# User data script to install and configure NGINX
data "template_file" "user_data" {
  template = file("${path.module}/user_data.sh")
}

# Launch template for instances
resource "aws_launch_template" "static_site" {
  name                   = "static-site-launch-template"
  image_id               = data.aws_ami.amazon_linux.id
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.instance_sg.id]
  user_data              = base64encode(data.template_file.user_data.rendered)

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "static-site-instance"
    }
  }
}

# Auto Scaling Group with exactly 3 instances
resource "aws_autoscaling_group" "static_site" {
  name               = "static-site-asg"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  desired_capacity   = 3
  max_size           = 3
  min_size           = 3
  target_group_arns  = [aws_lb_target_group.static_site.arn]

  launch_template {
    id = aws_launch_template.static_site.id
  }
}

# Output the ALB DNS name
output "alb_dns_name" {
  value = aws_lb.static_site.dns_name
}
```

```bash
# user_data.sh
#!/bin/bash
yum update -y
amazon-linux-extras install nginx1 -y
systemctl start nginx
systemctl enable nginx

# Create static site directory and deploy your content
mkdir -p /usr/share/nginx/html
# Add your static files here or use S3 sync
echo "<html><body><h1>Static Site Deployed Successfully</h1></body></html>" > /usr/share/nginx/html/index.html
```

**Snippet Usage Justification:**

`../data/terraform-examples/aws/aws_elb/application_elb/main.tf:main.tf`
This snippet was used because it provides a complete, working example of an Application Load Balancer configuration with Auto Scaling Group, which is exactly what's needed for serving a static site behind a load balancer. The ALB is the appropriate choice for HTTP/HTTPS traffic routing to web servers.

**Additional Notes:**
1. Replace the simple HTML in the user_data.sh with your actual static site content
2. For production use, consider:
   - Adding HTTPS termination with ACM certificate
   - Using S3 for static content storage
   - Implementing proper logging and monitoring
   - Restricting SSH access to specific IPs
3. The solution creates exactly 3 instances spread across 3 availability zones for high availability
4. Run `terraform apply` to deploy the infrastructure, then access your site via the ALB DNS name output
```