# Column Sorting Challenge

## Column header solution
This notebook maps the values of a source dataset to target columns, based on the column header names.

The solution is divided into the following steps:

  - Load the OpenAI API key from the .env file.
  - Load the dataset from a remote CSV using requests.
  - Create a list of target columns for the dataset.
  - Load the gpt-3.5-turbo model as an agent using LangChain.
  - Build a prompt that passes the entire CSV as raw text, together with the target columns, to the LLM.

To execute this notebook, adjust the environment variables in the .env file and run the cells below.
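
Before running the cells, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch, assuming the variable is named `OPENAI_API_KEY` (the name `langchain-openai` reads by default); `require_env` is a hypothetical helper and not part of the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to the .env file")
    return value


# Intended use, after load_dotenv('.env') has run (assumed variable name):
# require_env("OPENAI_API_KEY")
```

Failing here gives a more readable message than an authentication error raised later, in the middle of the agent call.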


In [1]:
!pip install -U -q langchain langchain-community langchain-openai langchainhub openai pandas python-dotenv

In [2]:
import requests
from dotenv import load_dotenv
from langchain import hub
from langchain.agents import AgentExecutor, create_json_chat_agent
from langchain_openai import ChatOpenAI

In [3]:
load_dotenv('.env')
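# Download the source CSV as raw text; fail early on HTTP errors
# so that an error page is never sent to the model as data.
csv_response = requests.get(
    'https://raw.githubusercontent.com/bsantanna/notebooks/master/challenges/column_sorting/data/cadastro.csv',
    timeout=30,
)
csv_response.raise_for_status()
csv_input = csv_response.text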
target_columns = "nome,localizacao,cpf_cnpj"
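
Because the raw text is handed to the model unchanged, a quick look at the source header row makes it easier to judge afterwards whether the mapping is plausible. The sketch below uses the standard `csv` module; `read_header` is a hypothetical helper, and `sample` is made-up illustrative data, not the contents of `cadastro.csv`.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the first row of a CSV string as a list of stripped column names."""
    reader = csv.reader(io.StringIO(csv_text))
    first_row = next(reader, [])  # empty input yields an empty header
    return [name.strip() for name in first_row]


# Made-up sample, for illustration only.
sample = "name,city,document\nAna,Recife,123\n"
source_columns = read_header(sample)

# The target columns used in this notebook, split into a list.
targets = "nome,localizacao,cpf_cnpj".split(",")
```

In the notebook itself the call would be `read_header(csv_input)`.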

In [4]:
# initializes JSON agent
# Reference: https://python.langchain.com/v0.1/docs/modules/agents/agent_types/json_agent/
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
prompt = hub.pull("hwchase17/react-chat-json")
tools = []
agent = create_json_chat_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=False,
    handle_parsing_errors=True
)

In [5]:
# Use a separate name so the agent prompt pulled from the hub is not shadowed.
task_prompt = f"""
Your task is to read a CSV_INPUT and a TARGET_COLUMNS list and return an output containing adjusted values into proper columns.

CSV_INPUT:{csv_input}

TARGET_COLUMNS:{target_columns}

"""
response = agent_executor.invoke({"input": task_prompt})
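
To reuse the same instruction with other datasets, the prompt construction could be wrapped in a small function. This is a sketch only: `build_prompt` is a hypothetical name, and the wording simply mirrors the instruction text used in the cell above.

```python
def build_prompt(csv_input: str, target_columns: str) -> str:
    """Assemble the instruction text sent to the agent."""
    return (
        "Your task is to read a CSV_INPUT and a TARGET_COLUMNS list and return "
        "an output containing adjusted values into proper columns.\n\n"
        f"CSV_INPUT:{csv_input}\n\n"
        f"TARGET_COLUMNS:{target_columns}\n"
    )


# Tiny illustrative call with placeholder data.
example = build_prompt("a,b\n1,2", "x,y")
```

With this helper, the agent call becomes `agent_executor.invoke({"input": build_prompt(csv_input, target_columns)})`.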

In [6]:
print(response["output"])

The output with adjusted values into proper columns based on the provided CSV_INPUT and TARGET_COLUMNS is as follows:

| nome                        | localizacao | cpf_cnpj        |
|-----------------------------|-------------|-----------------|
| Jorge Esteves               | São Paulo   | 608.243.480-36  |
| Maria da Silva              | Rio Janeiro | 20325634017     |
| Associação de Moradores da Vila X | São Paulo   | 17.084.064/0001-10 |
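
The agent returns free text, here a Markdown table, so any downstream code has to parse it. The following is a minimal sketch, assuming the output contains a single pipe-delimited table with a separator row; `parse_markdown_table` is a hypothetical helper and does not handle cells that themselves contain `|`.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse a single pipe-delimited Markdown table in text into row dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip prose surrounding the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    if len(rows) < 2:
        return []  # no header plus separator found
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return [dict(zip(header, cells)) for cells in body]


# First data row of the output shown above.
output = """
| nome          | localizacao | cpf_cnpj       |
|---------------|-------------|----------------|
| Jorge Esteves | São Paulo   | 608.243.480-36 |
"""
records = parse_markdown_table(output)
```

In the notebook the input would be `response["output"]`. Since the model is free to change its output format between runs, an empty result should be treated as a signal to inspect the raw text.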
