# Autodeployment Chat System — Colab Notebook (Aligned for Demo)
This Colab notebook implements a **demo-ready** backend flow that:
- Accepts **natural language** deployment instructions and a **GitHub repo URL or ZIP upload**.
- **Analyzes** the repo to infer framework, dependencies, start commands, and port.
- Generates an **infrastructure plan** (VM on AWS via Terraform for demo), with a dry-run option.
- Generates and runs **Terraform** to provision an Ubuntu EC2 instance, install the runtime, deploy the app, and expose the service.
- Performs a best-effort **replacement of `localhost`** with the instance's **public IP** inside common config and code files.
- Prints command **logs** inline for inspection and offers **destroy** to tear down the infrastructure.

> Demo defaults to a single **EC2 VM** on AWS. Serverless/Kubernetes hooks are scaffolded for future work.
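
The cells below collect the instruction text in a widget but do not show how it is interpreted. As a minimal sketch, assuming a simple keyword/regex approach (the helper `parse_instruction` and the `KNOWN_REGIONS` tuple are hypothetical names, not part of this notebook), region and port hints could be pulled from the instruction like this:

```python
import re

# Hypothetical helper, not part of the notebook's cells.
# The region list mirrors the options offered in the AWS Region dropdown.
KNOWN_REGIONS = (
    "us-east-1", "us-east-2", "us-west-1", "us-west-2",
    "eu-west-1", "eu-central-1", "ap-south-1",
    "ap-southeast-1", "ap-southeast-2",
)

def parse_instruction(text):
    """Extract optional region and port hints from a free-form instruction."""
    hints = {"region": None, "port": None}
    lowered = text.lower()
    # First known region mentioned anywhere in the text wins.
    for region in KNOWN_REGIONS:
        if region in lowered:
            hints["region"] = region
            break
    # Accept phrases such as "open port 3000" or "on port 8080".
    m = re.search(r"\bport\s+(\d{1,5})\b", lowered)
    if m and 1 <= int(m.group(1)) <= 65535:
        hints["port"] = int(m.group(1))
    return hints

hints = parse_instruction(
    "Deploy this Flask/Node app on AWS in us-east-1. Open port 3000 if needed."
)
print(hints)
```

Hints like these could pre-fill the region and port widgets. Anything the text does not mention stays `None`, so the widget defaults still apply.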

In [None]:
#@title 0) Setup & Dependencies
# This cell installs runtime dependencies and Terraform CLI on Colab.
# If you are running locally, make sure terraform is installed (>= 1.5).
# You can skip re-running once installed.

import os, sys, subprocess, shlex, json, re, base64, zipfile, pathlib, shutil, textwrap, glob, uuid, time

def run(cmd, env=None, cwd=None, check=True, shell=False):
    """Run a command, echo its output, and raise on failure when check=True.

    Pass shell=True for command lines that rely on pipes or $(...) substitution.
    """
    print(f"$ {cmd}")
    args = cmd if shell else shlex.split(cmd)
    p = subprocess.run(args, shell=shell, env=env, cwd=cwd, capture_output=True, text=True)
    print(p.stdout)
    if p.returncode != 0:
        print(p.stderr)
        if check:
            raise RuntimeError(f"Command failed: {cmd}")
    return p

# Detect if Terraform is installed; if not, attempt to install (Colab-friendly).
def ensure_terraform():
    if shutil.which("terraform"):
        run("terraform version", check=False)
        return
    # Try the HashiCorp apt repo first; fall back to a direct binary download.
    try:
        run("sudo apt-get update")
        run("sudo apt-get install -y gnupg software-properties-common curl unzip")
        # These two lines use pipes, so they must go through a shell.
        run("curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor --yes -o /usr/share/keyrings/hashicorp-archive-keyring.gpg", shell=True)
        run('echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list', shell=True)
        run("sudo apt-get update")
        run("sudo apt-get install -y terraform")
    except Exception as e:
        print(f"Apt path failed ({e}); trying direct download...")
    if not shutil.which("terraform"):
        url = "https://releases.hashicorp.com/terraform/1.6.6/terraform_1.6.6_linux_amd64.zip"
        run(f"curl -fL -o /tmp/terraform.zip {url}")
        run("unzip -o /tmp/terraform.zip -d /tmp")
        run("sudo mv /tmp/terraform /usr/local/bin/terraform")
    run("terraform version")

ensure_terraform()

# Python deps
!pip -q install gitpython ruamel.yaml tldextract >/dev/null 2>&1 || true

# Workspace layout
BASE = pathlib.Path.cwd() / "autodeploy_workspace"
INFRA = BASE / "infra" / "aws" / "vm_web"
REPO_DIR = BASE / "repo"
LOGS = BASE / "logs"
for p in [BASE, INFRA, REPO_DIR, LOGS]:
    p.mkdir(parents=True, exist_ok=True)

print("Workspace:", BASE)
print("Terraform dir:", INFRA)
print("Repo dir:", REPO_DIR)
print("Logs dir:", LOGS)

In [None]:
#@title 1) Inputs — Natural Language + Repo
# Provide deployment text and either a GitHub repo URL or upload a ZIP.
from ipywidgets import Text, Textarea, Dropdown, BoundedIntText, Password, Checkbox, Button, HBox, VBox, FileUpload, HTML, Output
from IPython.display import display, clear_output

nlp_input = Textarea(
    value="Deploy this Flask/Node app on AWS in us-east-1. Open port 3000 if needed.",
    placeholder="Natural language deployment instruction...",
    description="Instruction",
    layout={"width": "100%", "height": "100px"},
)

repo_url = Text(
    value="https://github.com/Arvo-AI/hello_world",
    placeholder="https://github.com/user/repo or leave blank to upload ZIP",
    description="GitHub URL",
    layout={"width": "100%"},
)

zip_upload = FileUpload(description="Upload ZIP", multiple=False)

region = Dropdown(
    options=[
        "us-east-1","us-east-2","us-west-1","us-west-2",
        "eu-west-1","eu-central-1","ap-south-1","ap-southeast-1","ap-southeast-2"
    ],
    value="us-east-1",
    description="AWS Region",
)

instance_type = Text(value="t3.micro", description="Instance")
open_port = BoundedIntText(value=3000, min=1, max=65535, description="Open Port")
allow_http = Checkbox(value=True, description="Open port 80 (HTTP)")
dry_run = Checkbox(value=False, description="Dry Run (no apply)")

aws_key_id = Text(value="", description="AWS_KEY_ID")
aws_secret = Password(value="", description="AWS_SECRET")
aws_session = Text(value="", description="AWS_SESSION")  # Optional

analyze_btn = Button(description="Analyze Repo", button_style="info")
plan_btn = Button(description="Generate Plan", button_style="warning")
apply_btn = Button(description="Provision & Deploy", button_style="success")
destroy_btn = Button(description="Destroy Infra", button_style="danger")

status = Output()

def save_zip_to_repo(upload_widget):
    REPO_DIR.mkdir(parents=True, exist_ok=True)
    # Clear repo dir
    for p in REPO_DIR.glob("*"):
        if p.is_file():
            p.unlink()
        else:
            shutil.rmtree(p, ignore_errors=True)
    if upload_widget.value:
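        uploads = upload_widget.value
        # ipywidgets 7 returns {filename: info}; ipywidgets 8 returns a tuple of info dicts.
        fileinfo = next(iter(uploads.values())) if isinstance(uploads, dict) else uploads[0]
        content = bytes(fileinfo["content"])  # content may be a memoryview on ipywidgets 8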
        zpath = BASE / "uploaded.zip"
        with open(zpath, "wb") as f:
            f.write(content)
        with zipfile.ZipFile(zpath, 'r') as zip_ref:
            zip_ref.extractall(REPO_DIR)
        return True
    return False

def clone_repo(url):
    # Clean repo dir
    for p in REPO_DIR.glob("*"):
        if p.is_file():
            p.unlink()
        else:
            shutil.rmtree(p, ignore_errors=True)
    if not url.strip():
        return False
    run(f"git clone --depth=1 {shlex.quote(url)} {shlex.quote(str(REPO_DIR))}", check=True)
    return True


def detect_app(repo_path):
    """Heuristic analysis of repo to infer application details."""
    out = {
        "framework": None, "language": None, "start_cmd": None, "port": None,
        "env": {}, "notes": [], "has_dockerfile": False
    }
    rp = pathlib.Path(repo_path)

    # Dockerfile?
    if (rp / "Dockerfile").exists():
        out["has_dockerfile"] = True
        out["notes"].append("Dockerfile found.")

    # Node.js detection
    pkg = rp / "package.json"
    if pkg.exists():
        out["language"] = "node"
        out["framework"] = "node"
        try:
            data = json.loads(pkg.read_text())
            scripts = (data.get("scripts") or {})
            # Prefer 'start' if present
            if "start" in scripts:
                out["start_cmd"] = "npm start"
            elif "serve" in scripts:
                out["start_cmd"] = "npm run serve"
            else:
                out["start_cmd"] = "node index.js"
            # try to infer port
            # common patterns: process.env.PORT || 3000 etc.
            for path in rp.rglob("*.js"):
                if "node_modules" in path.parts:
                    continue  # skip vendored dependencies
                text = path.read_text(errors="ignore")
                m = re.search(r"(?:PORT|port)\D{0,3}(\d{2,5})", text)
                if m:
                    out["port"] = int(m.group(1))
                    break
            if out["port"] is None:
                out["port"] = 3000
            return out
        except Exception as e:
            out["notes"].append(f"package.json parse failed: {e}")

    # Python detection (Flask, Django, FastAPI)
    req = rp / "requirements.txt"
    pyproject = rp / "pyproject.toml"
    pipfile = rp / "Pipfile"
    found_py = any([req.exists(), pyproject.exists(), pipfile.exists(), list(rp.rglob("*.py"))])
    if found_py:
        out["language"] = "python"
        requirements = (req.read_text(errors="ignore").lower() if req.exists() else "")
        code_blobs = []
        for p in rp.rglob("*.py"):
            try:
                code_blobs.append(p.read_text(errors="ignore").lower())
            except Exception:
                pass
        code_all = "\n".join(code_blobs)

        if "flask" in requirements or "from flask" in code_all:
            out["framework"] = "flask"
            out["start_cmd"] = "gunicorn app:app --bind 0.0.0.0:$PORT"
            out["port"] = 5000
        elif "fastapi" in requirements or "from fastapi" in code_all:
            out["framework"] = "fastapi"
            out["start_cmd"] = "uvicorn app:app --host 0.0.0.0 --port $PORT"
            out["port"] = 8000
        elif "django" in requirements or "import django" in code_all:
            out["framework"] = "django"
            out["start_cmd"] = "gunicorn mysite.wsgi:application --bind 0.0.0.0:$PORT"
            out["port"] = 8000
        else:
            out["framework"] = "python_app"
            out["start_cmd"] = "python3 app.py"
            out["port"] = 8000

        # Try to find explicit port in code (e.g., app.run(..., port=1234))
        m = re.search(r"port\s*=\s*(\d{2,5})", code_all)
        if m:
            out["port"] = int(m.group(1))
        return out

    # Default fallback
    out["framework"] = "unknown"
    out["language"] = "unknown"
    out["start_cmd"] = None
    out["port"] = 8080
    out["notes"].append("Could not detect framework; defaulting to port 8080.")
    return out

analysis = {}
tfvars = {}

def on_analyze(_):
    with status:
        clear_output()
        print("== Analyzing input ==")
    ok = False
    if repo_url.value.strip():
        ok = clone_repo(repo_url.value.strip())
    else:
        ok = save_zip_to_repo(zip_upload)
    if not ok:
        with status:
            print("No repository provided. Please supply a GitHub URL or upload a ZIP.")
        return
    global analysis
    analysis = detect_app(REPO_DIR)
    analysis["region"] = region.value
    analysis["open_port"] = open_port.value
    analysis["allow_http"] = bool(allow_http.value)
    with status:
        print("Analysis:", json.dumps(analysis, indent=2))

def on_generate_plan(_):
    if not analysis:
        with status:
            print("Run Analyze first.")
        return
    # Decision policy: always VM for demo; scaffold others
    plan = {
        "strategy": "vm_ec2",
        "region": analysis["region"],
        "instance_type": instance_type.value.strip() or "t3.micro",
        "open_ports": sorted({analysis["open_port"], 80} if allow_http.value else {analysis["open_port"]}),
        "language": analysis["language"],
        "framework": analysis["framework"],
        # The "Open Port" widget value wins; the detected port is only a fallback.
        "port": analysis["open_port"] or analysis["port"] or 8080,
        "start_cmd": analysis["start_cmd"],
        "repo_url": repo_url.value.strip() if repo_url.value.strip() else "uploaded_zip",
    }
    global tfvars
    tfvars = plan
    with status:
        clear_output()
        print("== Deployment plan ==")
        print(json.dumps(plan, indent=2))
        print("\nNext: click 'Provision & Deploy' to create infra (or enable Dry Run).")

def write_file(path, content):
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", newline="\n") as f:
        f.write(content)

def generate_terraform(plan):
    import uuid
    app_name = f"autodeploy-{uuid.uuid4().hex[:8]}"
    port = plan["port"]
    open_ports = plan["open_ports"]
    region = plan["region"]
    instance_type = plan["instance_type"]
    repo = plan["repo_url"]
    start_cmd = plan["start_cmd"] or ""

    variables_tf = f"""
variable "app_name" {{
  type = string
  default = "{app_name}"
}}

variable "instance_type" {{
  type    = string
  default = "{instance_type}"
}}

variable "region" {{
  type    = string
  default = "{region}"
}}

variable "open_ports" {{
  type    = list(number)
  default = [{", ".join(map(str, open_ports))}]
}}

variable "app_port" {{
  type    = number
  default = {port}
}}

variable "repo_url" {{
  type    = string
  default = {json.dumps(repo)}
}}

variable "start_cmd" {{
  type    = string
  default = {json.dumps(start_cmd)}
}}
"""

    main_tf = r"""
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.5.0"
}

provider "aws" {
  region = var.region
}

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_security_group" "app_sg" {
  name        = "${var.app_name}-sg"
  description = "Security group for ${var.app_name}"

  dynamic "ingress" {
    for_each = var.open_ports
    content {
      description = "App port"
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}

resource "aws_instance" "app" {
  ami                         = data.aws_ami.ubuntu.id
  instance_type               = var.instance_type
  associate_public_ip_address = true
  vpc_security_group_ids      = [aws_security_group.app_sg.id]

  user_data = templatefile("${path.module}/user_data.sh.tftpl", {
    REPO_URL  = var.repo_url,
    APP_PORT  = var.app_port,
    START_CMD = var.start_cmd
  })

  tags = {
    Name = var.app_name
  }
}

output "public_ip" {
  value = aws_instance.app.public_ip
}

output "public_dns" {
  value = aws_instance.app.public_dns
}
"""

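    # Rendered by Terraform's templatefile(): ${NAME} is a template variable, while
    # $${NAME} is an escape that emits a literal shell ${NAME}. Plain $NAME is left alone.
    # The shebang must be the very first line, or cloud-init will not run the script.
    user_data_tpl = r"""#!/bin/bash
set -euxo pipefail

export DEBIAN_FRONTEND=noninteractive
apt-get update -y
apt-get install -y curl git jq unzip python3-pip python3-venv

# Install Node 18 from NodeSource
curl -fsSL https://deb.nodesource.com/setup_18.x | bash -
apt-get install -y nodejs

# Capture the template value as a shell variable (single quotes keep $PORT literal)
START_CMD='${START_CMD}'

# Discover public IP for localhost replacement (IMDSv2 token first, plain request as fallback)
IMDS=http://169.254.169.254/latest
TOKEN=$(curl -sf -X PUT "$IMDS/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300" || true)
PUBIP=$(curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" "$IMDS/meta-data/public-ipv4" || echo "127.0.0.1")

# Create app folder
mkdir -p /opt/app
cd /opt/app

# Fetch repo
git clone --depth=1 "${REPO_URL}" app || true
cd app

# Replace localhost with public IP in common text files
grep -rIl "localhost" . | xargs -r sed -i "s/localhost/$PUBIP/g" || true

# Try Node first
if [ -f package.json ]; then
  npm install --omit=dev || npm install
  APPSTART="$${START_CMD:-npm start}"
  if ! grep -q '"start"' package.json; then
    APPSTART="node index.js"
  fi
  RUNTIME="node"
else
  # Python path
  python3 -m venv .venv
  . .venv/bin/activate
  # The inferred start commands need these even if requirements.txt omits them
  pip install gunicorn uvicorn
  if [ -f requirements.txt ]; then pip install -r requirements.txt || true; fi
  if [ -f pyproject.toml ]; then pip install . || true; fi
  APPSTART="$${START_CMD:-gunicorn app:app --bind 0.0.0.0:${APP_PORT}}"
  # Activate the venv when the service starts so gunicorn/uvicorn resolve
  APPSTART=". /opt/app/app/.venv/bin/activate && $APPSTART"
  RUNTIME="python"
fi

# Create systemd service (PORT comes from the Environment= line below)
cat >/etc/systemd/system/app.service <<EOF
[Unit]
Description=Autodeploy App
After=network.target

[Service]
WorkingDirectory=/opt/app/app
ExecStart=/bin/bash -lc '$APPSTART'
Restart=always
Environment=PORT=${APP_PORT}
Environment=HOST=0.0.0.0
User=root

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable app.service
systemctl restart app.service
"""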

    outputs_tf = r"""
output "app_url_http" {
  value = "http://${aws_instance.app.public_ip}:${var.app_port}"
}
"""

    write_file(INFRA / "variables.tf", variables_tf)
    write_file(INFRA / "main.tf", main_tf)
    write_file(INFRA / "outputs.tf", outputs_tf)
    write_file(INFRA / "user_data.sh.tftpl", user_data_tpl)

def on_apply(_):
    if not tfvars:
        with status:
            print("Generate Plan first.")
        return

    generate_terraform(tfvars)

    env = os.environ.copy()
    if aws_key_id.value and aws_secret.value:
        env["AWS_ACCESS_KEY_ID"] = aws_key_id.value
        env["AWS_SECRET_ACCESS_KEY"] = aws_secret.value
        if aws_session.value:
            env["AWS_SESSION_TOKEN"] = aws_session.value
    else:
        with status:
            print("Warning: No AWS credentials supplied in widgets. If your environment already has credentials, that's fine.")

    # Terraform workflow
    try:
        run("terraform -chdir=%s init" % str(INFRA), env=env, check=True)
        run("terraform -chdir=%s validate" % str(INFRA), env=env, check=False)
        plan_out = LOGS / "plan.out"
        run(f"terraform -chdir={str(INFRA)} plan -out={str(plan_out)}", env=env, check=True)
        if dry_run.value:
            with status:
                print("Dry run enabled — not applying.")
            return
        run(f"terraform -chdir={str(INFRA)} apply -auto-approve {str(plan_out)}", env=env, check=True)
        # Show outputs
        p = run(f"terraform -chdir={str(INFRA)} output -json", env=env, check=True)
        try:
            outs = json.loads(p.stdout)
            with status:
                print("== Outputs ==")
                print(json.dumps(outs, indent=2))
        except Exception:
            pass
    except Exception as e:
        with status:
            print("Provisioning failed:", e)

def on_destroy(_):
    env = os.environ.copy()
    if aws_key_id.value and aws_secret.value:
        env["AWS_ACCESS_KEY_ID"] = aws_key_id.value
        env["AWS_SECRET_ACCESS_KEY"] = aws_secret.value
        if aws_session.value:
            env["AWS_SESSION_TOKEN"] = aws_session.value
    try:
        run(f"terraform -chdir={str(INFRA)} destroy -auto-approve", env=env, check=False)
    finally:
        with status:
            print("Destroy attempted; check logs above for details.")

analyze_btn.on_click(on_analyze)
plan_btn.on_click(on_generate_plan)
apply_btn.on_click(on_apply)
destroy_btn.on_click(on_destroy)

display(
    VBox([
        nlp_input,
        repo_url,
        zip_upload,
        HBox([region, instance_type, open_port, allow_http]),
        HBox([dry_run]),
        HTML("<b>AWS Credentials (optional here if your Colab has them already)</b>"),
        HBox([aws_key_id, aws_secret, aws_session]),
        HBox([analyze_btn, plan_btn, apply_btn, destroy_btn]),
        status,
    ])
)

## Notes & Policy Alignment (for Demo)
- **Terraform** is the provisioning backbone (VM on AWS).  
- **Minimal intervention**: three clicks, Analyze → Plan → Provision.  
- **Generality**: Heuristics support **Node**, **Flask/Django/FastAPI**, and a fallback Python app.  
- **Adjustments**: Attempts to replace `localhost` with the instance **public IP** and binds to `0.0.0.0`.  
- **Logs**: Terraform logs and outputs are printed inline; systemd manages the app lifecycle.  
- **Future hooks**: Stubs allow extending to serverless (Lambda/API GW) or containers (ECS/EKS).  
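
The `localhost` adjustment above runs on the instance as a `grep`/`sed` one-liner. For readers who want to preview its effect locally before provisioning, here is a rough Python equivalent. It is a sketch under assumptions: `replace_localhost` and the `TEXT_SUFFIXES` set are hypothetical names, and the suffix filter is narrower than the shell step, which touches every non-binary file.

```python
import pathlib
import tempfile

# Hypothetical helper; the suffix list is an assumption, not the notebook's rule.
TEXT_SUFFIXES = {".js", ".ts", ".py", ".json", ".yml", ".yaml", ".html"}
SKIP_DIRS = {".git", "node_modules"}

def replace_localhost(root, public_ip):
    """Rewrite 'localhost' to public_ip in text files under root.

    Returns the list of files that were changed.
    """
    root = pathlib.Path(root)
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        # Only look at directory names inside the repo, not above it.
        if SKIP_DIRS & set(path.relative_to(root).parts):
            continue
        text = path.read_text(errors="ignore")
        if "localhost" in text:
            path.write_text(text.replace("localhost", public_ip))
            changed.append(path)
    return changed

# Usage on a throwaway directory; 203.0.113.10 is a documentation-only address.
with tempfile.TemporaryDirectory() as tmp:
    cfg = pathlib.Path(tmp) / "config.js"
    cfg.write_text('const API = "http://localhost:3000";\n')
    changed = replace_localhost(tmp, "203.0.113.10")
    result = cfg.read_text()
print(result)
```

Like the shell step, this is a blunt textual substitution: it also rewrites occurrences that were meant to stay local (for example, a database host on the same machine), so the list of changed files is worth reviewing.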

### How to Demo Quickly
1. Click **Analyze Repo** with `https://github.com/Arvo-AI/hello_world` or upload a ZIP.  
2. Click **Generate Plan** and review inferred framework/port.  
3. Supply AWS creds (or rely on attached Colab secrets), then click **Provision & Deploy**.  
4. When done, click **Destroy Infra**.

> If you lack AWS creds, enable **Dry Run** to show the full plan without applying.
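
During a dry run, the saved plan can also be condensed into a short summary. Terraform can render a plan file as JSON with `terraform show -json <planfile>`, and that output contains a `resource_changes` list whose entries carry `change.actions`. The sketch below counts those actions. The helper name `summarize_plan` and the sample data are illustrative assumptions: the sample is hand-written in that shape, not captured from a real run.

```python
from collections import Counter

def summarize_plan(plan):
    """Count planned actions per resource in a JSON-decoded Terraform plan."""
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement appears as two actions, e.g. ["delete", "create"].
        counts["/".join(actions)] += 1
    return dict(counts)

# Hand-written sample shaped like the two resources this notebook defines.
sample = {
    "resource_changes": [
        {"address": "aws_security_group.app_sg", "change": {"actions": ["create"]}},
        {"address": "aws_instance.app", "change": {"actions": ["create"]}},
    ]
}
summary = summarize_plan(sample)
print(summary)
```

In the notebook, the input would come from `json.loads` applied to the output of `terraform show -json` for the plan file written under the logs directory.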