# Introduction to Automation with LangChain, Generative AI, and Python
**4.1: Image Generation**
* Instructor: [Jeff Heaton](https://youtube.com/@HeatonResearch), WUSTL Center for Analytics and Business Insight (CABI), [Washington University in St. Louis](https://olin.wustl.edu/faculty-and-research/research-centers/center-for-analytics-and-business-insight/index.php)
* For more information visit the [class website](https://github.com/jeffheaton/cabi_genai_automation).

## Automatic TA

We will now see how an LLM can act as a teaching assistant and answer student emails for an instructor. We begin by defining a model and a simple function to load data from URLs.

In [2]:
from langchain_core.prompts import PromptTemplate
from langchain_aws import ChatBedrock
from IPython.display import display_markdown
import requests

MODEL = 'anthropic.claude-3-sonnet-20240229-v1:0'

# Initialize the Bedrock chat model, using the environment's built-in role
llm = ChatBedrock(
    model_id=MODEL,
    model_kwargs={"temperature": 0.0},
)

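def fetch_url_content(url):
    """Download a URL and return its body as text, or an error message on failure."""
    try:
        # Send a GET request to the URL; the timeout keeps a stalled server from hanging the notebook
        response = requests.get(url, timeout=30)

        # Raise an exception for HTTP error status codes (4xx/5xx)
        response.raise_for_status()

        # Return the content as a string
        return response.text

    except requests.RequestException as e:
        # Return the error message if the request failed
        return f"An error occurred: {e}"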
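
Because `fetch_url_content` returns the error text instead of raising, a failed download would be passed to the model as if it were the student's email or code. One way to guard against this is sketched below. The helper `fetch_all` and the constant `ERROR_PREFIX` are hypothetical names, not part of the original notebook, and the sketch assumes only that failures begin with the prefix used in the function above.

```python
from typing import Callable, Dict

# Assumption: matches the prefix that fetch_url_content puts on failures.
ERROR_PREFIX = "An error occurred:"


def fetch_all(urls: Dict[str, str], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch each named URL and raise if any download reported an error."""
    results = {}
    for name, url in urls.items():
        text = fetch(url)
        if text.startswith(ERROR_PREFIX):
            # Fail loudly instead of sending the error text to the model.
            raise RuntimeError(f"Could not load '{name}' from {url}: {text}")
        results[name] = text
    return results
```

It would be called with a dictionary mapping names such as `email` and `code` to their URLs, passing `fetch_url_content` as the `fetch` argument. Taking the fetch function as a parameter also makes the check easy to exercise without network access.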


We create a simple chain that provides instructions and expects four pieces of information to process.

* [email](https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_email_1.txt) - The email that the student wrote, asking for help.
* [code](https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_code_1.txt) - Any code that the student provided for the question.
* [instructions](https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_instructions_1.txt) - The instructions the professor gave.
* [solution](https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_solution_1.txt) - The solution to this assignment, created by the professor.

The following code sets up this prompt and chain.

In [3]:
email_prompt_template = PromptTemplate( 
    input_variables = ['email', 'code', 'instructions', 'solution'], 
    template = """
You are an assistant for Jeff Heaton, who is an adjunct instructor at Washington University
in St. Louis. You should provide helpful responses to students via email queries that they
send you. A student in my course is asking for help on an assignment. 
Please compose an email giving them help and code suggestions, but do not give them the entire 
solution. 

The student's email is given between the triple backticks ``` and ```.
```{email}```

The student's code is given between the triple backticks ``` and ```.
```{code}```

The assignment instructions are given between the triple backticks ``` and ```.
```{instructions}```

The solution to this assignment is given between the triple backticks ``` and ```.
```{solution}```
""")

chain = email_prompt_template | llm
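
Before sending anything to the model, it can help to preview the text that a template will produce and to confirm that every variable has been supplied. The sketch below uses plain Python string formatting on a shortened stand-in template rather than LangChain itself. `PREVIEW_TEMPLATE` and `render_preview` are hypothetical names introduced for illustration, and the sketch assumes `{name}`-style placeholders like those in the prompt above.

```python
from string import Formatter

# Shortened stand-in for the full prompt above (illustrative only).
PREVIEW_TEMPLATE = (
    "Student email:\n{email}\n\n"
    "Student code:\n{code}\n"
)


def render_preview(template: str, **values: str) -> str:
    """Fill {name} placeholders, reporting any that were not supplied."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    needed = {field for _, field, _, _ in Formatter().parse(template) if field}
    missing = needed - set(values)
    if missing:
        raise KeyError(f"Missing prompt variables: {sorted(missing)}")
    return template.format(**values)


print(render_preview(PREVIEW_TEMPLATE, email="My loss is huge.", code="model.fit(x, y)"))
```

With LangChain itself, calling `email_prompt_template.format(...)` with the same four keyword arguments should return the filled-in prompt string, which serves the same purpose.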


We now call this chain with the provided data.

In [4]:
email = fetch_url_content("https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_email_1.txt")
code = fetch_url_content("https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_code_1.txt")
instructions = fetch_url_content("https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_instructions_1.txt")
solution = fetch_url_content("https://data.heatonresearch.com/wustl/CABI/genai-langchain/autohelp/autohelp_solution_1.txt")

message = chain.invoke({
    'email': email,
    'code': code,
    'instructions': instructions,
    'solution': solution,
})

We now examine the formatted response from the AI.

In [6]:
from IPython.display import display_markdown
display_markdown(message.content,raw=True)

Dear Joe,

Thank you for your email and for sharing your code. I understand your concerns, and I'll try to provide some guidance to help you with the assignment.

1. **Model Selection (Regression or Classification):**
The decision to use regression or classification models depends on the nature of the target variable (the variable you want to predict). In your case, since the target variable is a continuous numeric value (house price), it is indeed a regression problem. Using a regression model is the correct approach.

2. **Loss Function:**
For regression problems, the most common loss function is the mean squared error (MSE). This loss function measures the squared difference between the predicted values and the actual values. Cross-entropy loss is typically used for classification problems.

3. **High Loss and Slow Training:**
The high loss values and slow training time could be due to several reasons:
- **Data Scaling**: It's essential to scale or normalize your input features before training a neural network. This helps the model converge faster and achieve better performance. You can use techniques like standardization (z-score normalization) or min-max scaling.
- **Model Architecture**: The architecture of your neural network (number of layers, number of neurons per layer, activation functions) can significantly impact the model's performance. You may need to experiment with different architectures to find the one that works best for your problem.
- **Learning Rate**: The learning rate is a hyperparameter that controls the step size during the optimization process. If the learning rate is too high, the model may diverge or oscillate, leading to high loss values. If it's too low, the training may be slow. You can try adjusting the learning rate or using techniques like learning rate schedulers.

4. **Prediction Interpretation:**
The small values (around 0.01 and 0.001) you're observing are likely due to the way the model is predicting the target variable. Neural networks often output values in a different scale than the original target variable. To interpret the predictions correctly, you may need to apply an inverse transformation or scaling to map the predicted values back to the original scale of the target variable.

Here are some suggestions for your code:

```python
# Standardize input features
x = df_houses_train[["bedrooms", "bathrooms", "garage", "land", "sqft", "median_income"]].values
x = (x - x.mean(axis=0)) / x.std(axis=0)

# Try different model architectures
model = Sequential()
model.add(Dense(64, input_dim=x.shape[1], activation="relu"))
model.add(Dense(32, activation="relu"))
model.add(Dense(1))  # Output layer for regression

# Adjust learning rate
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss="mean_squared_error")

# Monitor validation loss during training
early_stop = EarlyStopping(monitor='val_loss', patience=10)
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=100, callbacks=[early_stop])
```

I hope these suggestions help you improve your model's performance. If you have any further questions or need additional assistance, feel free to reach out.

Best regards,
[Your Name]