# Lab 9 Part 2

In [1]:
%pip install openai -q

Note: you may need to restart the kernel to use updated packages.


In [2]:
import boto3
from botocore.exceptions import ClientError
import json
from IPython.display import display, Image, Markdown
from openai import OpenAI
import pandas as pd
from pprint import pprint

In [3]:
def get_secret(secret_name):
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

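    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # Re-raise unchanged so the caller sees the original AWS error
        # (ClientError comes from botocore.exceptions and must be imported)
        raise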

    secret = get_secret_value_response['SecretString']
    
    return json.loads(secret)

In [4]:
client = OpenAI(api_key=get_secret('openai')['api_key'])

In [5]:
df = pd.read_csv('data_lab9.csv')
data_json = df.to_json(orient="records")
df.head()

| Unnamed: 0 | State | Votes | foreign_pop_edited | Location | afd_percent_edited |
|---|---|---|---|---|---|
| 0 | Baden-Württemberg | 1256267 | 0.185 | West | 0.198 |
| 1 | Bavaria | 1515533 | 0.16 | West | 0.19 |
| 2 | Berlin | 296999 | 0.233 | East | 0.152 |
| 3 | Brandenburg | 535279 | 0.075 | East | 0.325 |
| 4 | Bremen | 52494 | 0.219 | West | 0.151 |


In [6]:
data_prompt = f"Analyze the provided data and determine the relationship between foreign population and AFD support in German states. Provide Python-generated charts to support your conclusion. Data: {data_json}"
# print(data_prompt)

In [7]:
def openai_gpt_help(prompt):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model='gpt-4o',
        messages=messages,
        temperature=0  # 0 minimizes sampling randomness
    )
    token_usage = response.usage
    
    pprint(f"Tokens used: {token_usage}")

    return response.choices[0].message.content

In [8]:
gpt_result = openai_gpt_help(prompt=data_prompt)

('Tokens used: CompletionUsage(completion_tokens=1111, prompt_tokens=578, '
 'total_tokens=1689, '
 'completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, '
 'audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), '
 'prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))')


In [9]:
display(Markdown(gpt_result))

To analyze the relationship between the foreign population and AfD support in German states, we can use Python to create a scatter plot. This will help visualize any correlation between the percentage of the foreign population and the percentage of votes for the AfD party. Let's proceed with the analysis using Python and the `matplotlib` and `pandas` libraries.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data
data = [
    {"State": "Baden-Württemberg", "Votes": 1256267, "foreign_pop_edited": 0.185, "Location": "West", "afd_percent_edited": 0.198},
    {"State": "Bavaria", "Votes": 1515533, "foreign_pop_edited": 0.16, "Location": "West", "afd_percent_edited": 0.19},
    {"State": "Berlin", "Votes": 296999, "foreign_pop_edited": 0.233, "Location": "East", "afd_percent_edited": 0.152},
    {"State": "Brandenburg", "Votes": 535279, "foreign_pop_edited": 0.075, "Location": "East", "afd_percent_edited": 0.325},
    {"State": "Bremen", "Votes": 52494, "foreign_pop_edited": 0.219, "Location": "West", "afd_percent_edited": 0.151},
    {"State": "Hamburg", "Votes": 113463, "foreign_pop_edited": 0.2, "Location": "West", "afd_percent_edited": 0.109},
    {"State": "Hesse", "Votes": 636771, "foreign_pop_edited": 0.194, "Location": "West", "afd_percent_edited": 0.178},
    {"State": "Mecklenburg-Vorpommern", "Votes": 357356, "foreign_pop_edited": 0.07, "Location": "East", "afd_percent_edited": 0.35},
    {"State": "Lower Saxony", "Votes": 894057, "foreign_pop_edited": 0.123, "Location": "West", "afd_percent_edited": 0.178},
    {"State": "North Rhine-Westphalia", "Votes": 1769822, "foreign_pop_edited": 0.161, "Location": "West", "afd_percent_edited": 0.168},
    {"State": "Rhineland-Palatinate", "Votes": 498733, "foreign_pop_edited": 0.142, "Location": "West", "afd_percent_edited": 0.201},
    {"State": "Saarland", "Votes": 129201, "foreign_pop_edited": 0.148, "Location": "West", "afd_percent_edited": 0.216},
    {"State": "Saxony", "Votes": 958363, "foreign_pop_edited": 0.081, "Location": "East", "afd_percent_edited": 0.373},
    {"State": "Saxony-Anhalt", "Votes": 496100, "foreign_pop_edited": 0.08, "Location": "East", "afd_percent_edited": 0.371},
    {"State": "Schleswig-Holstein", "Votes": 306191, "foreign_pop_edited": 0.107, "Location": "West", "afd_percent_edited": 0.163},
    {"State": "Thuringia", "Votes": 510519, "foreign_pop_edited": 0.083, "Location": "East", "afd_percent_edited": 0.386}
]

# Create a DataFrame
df = pd.DataFrame(data)

# Plot
plt.figure(figsize=(10, 6))
for location, color in zip(['West', 'East'], ['blue', 'red']):
    subset = df[df['Location'] == location]
    plt.scatter(subset['foreign_pop_edited'], subset['afd_percent_edited'], label=location, color=color)

plt.title('Relationship between Foreign Population and AfD Support in German States')
plt.xlabel('Foreign Population (%)')
plt.ylabel('AfD Support (%)')
plt.legend(title='Location')
plt.grid(True)
plt.show()
```

### Analysis

- **Scatter Plot**: The scatter plot shows the relationship between the percentage of the foreign population and the percentage of AfD support in each state. The data points are color-coded based on whether the state is in the East or West of Germany.

- **Observations**:
  - States in the East (red) generally have a lower percentage of foreign population and higher AfD support.
  - States in the West (blue) tend to have a higher percentage of foreign population and lower AfD support.
  - There appears to be a negative correlation between the foreign population percentage and AfD support, especially when considering the East and West separately.

This analysis suggests that states with a higher foreign population percentage tend to have lower support for the AfD party, particularly in the West. Conversely, states with a lower foreign population percentage, especially in the East, tend to have higher AfD support.

## Reflection

#### How the model's reasoning supported your analysis
The model's reasoning supported the findings of our capstone project: East German states have lower foreign-population shares but higher AfD support, while Western states have higher foreign-population shares and lower AfD support. Working from the same data, the model reached the same conclusion and identified a negative correlation between foreign population and AfD support (as one rises, the other tends to fall).
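
The negative correlation described here can also be checked numerically rather than read off the scatter plot. The following is a minimal sketch, assuming the 16 state rows echoed in the model's response earlier in this notebook; the names `rows`, `df_check`, `overall_r`, and `regional_r` are illustrative and not part of the original lab code. It computes the Pearson correlation with pandas, once across all states and once within each region.

```python
import pandas as pd

# (State, foreign_pop_edited, Location, afd_percent_edited), copied from the
# data listed in the model's response earlier in this notebook.
rows = [
    ("Baden-Württemberg", 0.185, "West", 0.198),
    ("Bavaria", 0.16, "West", 0.19),
    ("Berlin", 0.233, "East", 0.152),
    ("Brandenburg", 0.075, "East", 0.325),
    ("Bremen", 0.219, "West", 0.151),
    ("Hamburg", 0.2, "West", 0.109),
    ("Hesse", 0.194, "West", 0.178),
    ("Mecklenburg-Vorpommern", 0.07, "East", 0.35),
    ("Lower Saxony", 0.123, "West", 0.178),
    ("North Rhine-Westphalia", 0.161, "West", 0.168),
    ("Rhineland-Palatinate", 0.142, "West", 0.201),
    ("Saarland", 0.148, "West", 0.216),
    ("Saxony", 0.081, "East", 0.373),
    ("Saxony-Anhalt", 0.08, "East", 0.371),
    ("Schleswig-Holstein", 0.107, "West", 0.163),
    ("Thuringia", 0.083, "East", 0.386),
]
df_check = pd.DataFrame(
    rows,
    columns=["State", "foreign_pop_edited", "Location", "afd_percent_edited"],
)

# Pearson correlation across all 16 states
overall_r = df_check["foreign_pop_edited"].corr(df_check["afd_percent_edited"])

# The same statistic computed separately for East and West
regional_r = {
    location: group["foreign_pop_edited"].corr(group["afd_percent_edited"])
    for location, group in df_check.groupby("Location")
}

print(f"Overall Pearson r: {overall_r:.2f}")
for location, r in regional_r.items():
    print(f"{location}: r = {r:.2f}")
```

A coefficient near -1 would indicate a strong negative linear relationship. With only 16 states, and fewer within each region, these values are best read as descriptive rather than conclusive.
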
#### Whether this approach could be applied to real-world intelligence workflows
This approach could definitely be applied to real-world intelligence workflows. It reached the same conclusion that we had reached through earlier data visualizations and further research, which suggests that similar prompt-driven analyses could be built for other topics.
#### Any limitations or ethical concerns you encountered
We did not encounter obvious limitations or ethical concerns in this specific exercise, but the potential for them exists. The way a prompt is constructed leaves room for bias and could skew the results in a particular direction.
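
One way to probe this prompt-sensitivity concern is to send the same data with differently worded questions and compare the answers. The sketch below only builds the prompts and makes no API call; the function name `build_prompt_variants` and the specific wordings are hypothetical and not part of the original lab. Each resulting prompt could then be passed to `openai_gpt_help` from earlier in the notebook.

```python
def build_prompt_variants(data_json: str) -> dict[str, str]:
    """Return differently framed prompts that share an identical data payload."""
    # Hypothetical wordings: one neutral, two that presuppose a conclusion.
    framings = {
        "neutral": (
            "Analyze the provided data and describe any relationship between "
            "foreign population and AfD support in German states."
        ),
        "leading_negative": (
            "Explain why a higher foreign population leads to lower AfD "
            "support in German states."
        ),
        "leading_positive": (
            "Explain why a higher foreign population leads to higher AfD "
            "support in German states."
        ),
    }
    return {
        name: f"{question} Data: {data_json}"
        for name, question in framings.items()
    }


# Placeholder payload for illustration only; in the notebook this would be data_json.
example_payload = '[{"State": "Example", "foreign_pop_edited": 0.1, "afd_percent_edited": 0.2}]'
variants = build_prompt_variants(example_payload)
for name, prompt in variants.items():
    print(f"{name}: {prompt[:70]}...")
```

If the model's conclusion changes with the framing while the data stays fixed, that would be a sign the result depends on the prompt wording rather than on the data alone.
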