# AutoGluon Assistant - Quick Start

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/autogluon/autogluon-assistant)
[![Open In SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://github.com/autogluon/autogluon-assistant)

(Links above are still WIP)

In this tutorial, we will see how to use AutoGluon Assistant (AG-A) to solve machine learning problems **with zero lines of code**. AG-A combines AutoGluon's state-of-the-art AutoML capabilities with Large Language Models (LLMs) to automate the entire data science pipeline.

We will cover:
- Setting up AutoGluon Assistant
- Preparing Your Data
- Using AutoGluon Assistant

By the end of this tutorial, you'll be able to build accurate ML solutions for your own data using just natural language instructions. Let's get started with the installation!

## Setting up AutoGluon Assistant
Getting started with AutoGluon Assistant is straightforward. Let's install it directly using pip:

In [None]:
!pip install "autogluon-assistant[dev] @ git+https://github.com/autogluon/autogluon-assistant.git"

AutoGluon Assistant supports two LLM providers: AWS Bedrock (default) and OpenAI. Choose one of the following setups:

In [None]:
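# Note: `!export` only affects a temporary subshell, so set the variables with os.environ instead.
import os

# Option A: AWS Bedrock (Recommended)
os.environ["BEDROCK_API_KEY"] = "4509..."
os.environ["AWS_DEFAULT_REGION"] = "<your-region>"
os.environ["AWS_ACCESS_KEY_ID"] = "<your-access-key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<your-secret-key>"
### OR ###
# Option B: OpenAI (uncomment to use)
# os.environ["OPENAI_API_KEY"] = "sk-..."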

*Note: If using OpenAI, we recommend a paid API key rather than a free-tier account to avoid rate limiting issues.*
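
Before moving on, it can help to confirm which credentials the current Python process can actually see. The snippet below is a minimal sketch and not part of AutoGluon Assistant: `detect_provider` is a hypothetical helper, and it assumes that the variable names from the setup cell above are the ones that matter and that all four Bedrock variables are needed together.

```python
import os

# Environment variable names taken from the setup cell above.
BEDROCK_VARS = (
    "BEDROCK_API_KEY",
    "AWS_DEFAULT_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)
OPENAI_VARS = ("OPENAI_API_KEY",)


def detect_provider(env=None):
    """Return "bedrock", "openai", or None depending on which variables are set.

    Hypothetical helper for illustration only; AutoGluon Assistant may
    choose its provider differently.
    """
    env = os.environ if env is None else env
    if all(env.get(name) for name in BEDROCK_VARS):
        return "bedrock"
    if all(env.get(name) for name in OPENAI_VARS):
        return "openai"
    return None


print("Detected provider credentials:", detect_provider())
```

If this prints `None`, the variables are not visible to the notebook kernel and should be set again before continuing.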

Let's verify the installation by importing the package:

In [None]:
import autogluon_assistant

print(autogluon_assistant.__version__)


Now that you have AutoGluon Assistant installed and configured, let's move on to preparing your data directory structure for your first ML project!

## Preparing Your Data

For this tutorial, we'll use the classic Titanic dataset, which is perfect for getting started with machine learning. The goal is to predict whether a passenger survived based on characteristics such as age, gender, ticket class, and other features. We sampled 1000 training and test examples from the original data. The sampled dataset makes this tutorial run quickly, but AutoGluon Assistant can handle the full dataset if desired.

Let's download the example data:

In [None]:
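import os

import requests

# Create the data directory and download the example files
base_url = "https://raw.githubusercontent.com/autogluon/autogluon-assistant/main/toy_data"
os.makedirs("./toy_data", exist_ok=True)
for f in ["train.csv", "test.csv", "descriptions.txt"]:
    response = requests.get(f"{base_url}/{f}", timeout=30)
    response.raise_for_status()  # fail loudly instead of saving an error page
    with open(f"./toy_data/{f}", "wb") as out:
        out.write(response.content)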

That's it! We now have:

- `train.csv`: Training data with labeled examples
- `test.csv`: Test data for making predictions
- `descriptions.txt`: A description of the dataset and task
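
A quick sanity check can confirm that the three files listed above are really on disk and non-empty. This is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper introduced here for illustration, not part of AutoGluon Assistant.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("train.csv", "test.csv", "descriptions.txt")


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent or empty in data_dir."""
    root = Path(data_dir)
    return [
        name
        for name in expected
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


print("Missing or empty files:", missing_files("./toy_data"))
```

An empty list means the data directory is ready to use.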

Let's take a quick look at our training data and description file:

In [None]:
import pandas as pd
train_data = pd.read_csv("toy_data/train.csv")
train_data.head()

In [None]:
with open('toy_data/descriptions.txt', 'r') as f:
    print(f.read())
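
It can also be useful to see which columns appear only in the training file, since those are natural candidates for the prediction target. The sketch below rests on an assumption that this tutorial does not confirm: that `test.csv` omits the label column. `read_header` and `train_only_columns` are hypothetical helpers written for illustration.

```python
import csv
import os


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def train_only_columns(train_path, test_path):
    """Return columns present in the training file but absent from the test file."""
    test_cols = set(read_header(test_path))
    return [col for col in read_header(train_path) if col not in test_cols]


# Only run the comparison if the example files have been downloaded.
if os.path.exists("toy_data/train.csv") and os.path.exists("toy_data/test.csv"):
    print(train_only_columns("toy_data/train.csv", "toy_data/test.csv"))
```

If the result is empty, both files share the same columns and the target has to be identified from the description file instead.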

## Using AutoGluon Assistant

Now that we have our data ready, let's use AutoGluon Assistant to build our ML model. The simplest way to use AutoGluon Assistant is through the command line - no coding required! After installing the package, you can run it directly from your terminal:

In [None]:
#TODO: remove the requirement of config files
!autogluon-assistant ./toy_data

Let's also look at how to use AutoGluon Assistant programmatically in Python:

In [None]:
from autogluon_assistant import AutogluonAssistant

# Initialize the assistant
assistant = AutogluonAssistant()

# Run the assistant
output_file = assistant.predict(data_dir="./toy_data")

Let's examine the predictions:

In [None]:
predictions = pd.read_csv(output_file)
print("\nFirst few predictions:")
print(predictions.head())
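
One simple check on the output is that it contains one prediction per test example. The following is a minimal sketch under that assumption (one output row per row of `test.csv`, each file with a single header row); `count_data_rows` and `same_row_count` are hypothetical helpers, not part of AutoGluon Assistant.

```python
import csv


def count_data_rows(csv_path):
    """Count the non-empty rows of a CSV file, excluding the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return sum(1 for row in reader if row)


def same_row_count(predictions_path, test_path):
    """Return True if both files contain the same number of data rows."""
    return count_data_rows(predictions_path) == count_data_rows(test_path)
```

In this notebook, it could be called as `same_row_count(output_file, "toy_data/test.csv")`.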

## Conclusion

In this quickstart tutorial, we saw how AutoGluon Assistant simplifies the entire ML pipeline, allowing users to solve machine learning problems with minimal effort. Given just a data directory, AutoGluon Assistant handles the whole process, from data understanding to prediction generation. Check out the other tutorials to learn more about customizing the configuration (WIP), using different LLM providers, and handling various types of ML tasks.

Want to dive deeper? Explore our GitHub repository for more advanced features and examples.