# Can LLMs help us interpret correlations?

Test correlation 1: Attribute ViolentRate1000 from table 00021_Social_Chicago is correlated with Attribute pct_Civ_unemp_16p_ACS from table 00011_ACS1519_Chicago at the census tract level. The correlation coefficient is 0.66.

Ground truth explanation: The violent crime rate is positively correlated with the percentage of civilians aged 16 and over in the labor force who are unemployed.

Test correlation 2: Attribute ViolentRate1000 from table 00021_Social_Chicago is correlated with Attribute pct_MrdCple_HHD from table 00011_ACS1519_Chicago at the census tract level. The correlation coefficient is -0.64.

Ground truth explanation: The violent crime rate is negatively correlated with the percentage of Married-Couple Households.
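
For readers who want to see where a coefficient such as 0.66 or -0.64 could come from, here is a minimal sketch. It assumes the coefficient is Pearson's r, which the notebook does not state, and the tract values are made up for illustration; they are not taken from the Chicago tables.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five hypothetical census tracts (NOT the Chicago data).
violent_rate = [2.0, 5.5, 3.1, 8.0, 6.2]
pct_unemployed = [4.0, 9.0, 7.5, 12.0, 6.0]

r = pearson_r(violent_rate, pct_unemployed)
print(f"r = {r:.2f}")
```

In practice each list would hold one value per census tract, joined on the tract identifier shared by the two tables.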

In [1]:
import ollama

class ChatBot:
    """Thin wrapper around the ollama client that streams replies to stdout."""

    def __init__(self, model: str):
        self.model = model
        # Chat history; the whole list is resent to the model on every chat() call.
        self.message_queue = []

    def generate(self, msg: str, options=None):
        """
        Generation mode: a single-turn completion that ignores the chat history.
        """
        stream = ollama.generate(
            model=self.model,
            prompt=msg,
            stream=True,
            options=options,
        )

        # Print each chunk as soon as it arrives.
        response = ""
        for chunk in stream:
            part = chunk['response']
            print(part, end='', flush=True)
            response = response + part

    def chat(self, msg: str, options=None):
        """
        Chat mode: append the user message to the history, stream the reply,
        then store the reply so later turns can refer back to it.
        """
        self.message_queue.append({'role': 'user', 'content': msg})
        stream = ollama.chat(
            model=self.model,
            messages=self.message_queue,
            stream=True,
            options=options,
        )

        response = ""
        for chunk in stream:
            part = chunk['message']['content']
            print(part, end='', flush=True)
            response = response + part

        self.message_queue.append({'role': 'assistant', 'content': response})

    def clear_message_queue(self):
        """Forget the conversation so the next chat() starts from scratch."""
        self.message_queue.clear()

## The ability to decipher attribute names

In [2]:
model_name = 'tinyllama'
client = ChatBot(model_name)

In [3]:
question = """

Attribute "ViolentRate1000" in the table "00021_Social_Chicago" means Violent rate * 1000.

What does the attribute "pct_Civ_unemp_16p_ACS" mean in a table "00011_ACS1519_Chicago"?

Write your response in the format of "pct_Civ_unemp_16p_ACS: your answer"

"""
# todo: output format?
# todo: prompt template
client.generate(msg=question, options={"seed": 123, "temperature": 0})

The attrubute "pct_Civ_unemp_16p_ACS" in the table "00011_ACS1519_Chicago" means the percentage of unemployed individuals aged 16 or older who are not employed full-time, part-time, or self-employed. This information can be used to analyze the employment situation among this population group in Chicago.
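
The two todo items in the cell above (prompt template, output format) could be approached roughly as follows. This is only a sketch: `build_attribute_prompt` and `parse_attribute_answer` are hypothetical helper names, not part of the notebook or of ollama, and the parser simply looks for a line that starts with the requested `name:` prefix.

```python
def build_attribute_prompt(known, target_attr, target_table):
    """Build a prompt asking what `target_attr` means.

    `known` is a list of (attribute, table, meaning) tuples given as context.
    """
    lines = [
        f'Attribute "{attr}" in the table "{table}" means {meaning}.'
        for attr, table, meaning in known
    ]
    lines.append(
        f'What does the attribute "{target_attr}" mean in the table "{target_table}"?'
    )
    lines.append(
        f'Write your response in the format of "{target_attr}: your answer"'
    )
    return "\n\n".join(lines)


def parse_attribute_answer(response, target_attr):
    """Return the text after "<target_attr>:", or None if the format was not followed."""
    prefix = f"{target_attr}:"
    for line in response.splitlines():
        stripped = line.strip().strip('"')
        if stripped.lower().startswith(prefix.lower()):
            return stripped[len(prefix):].strip()
    return None


prompt = build_attribute_prompt(
    known=[("ViolentRate1000", "00021_Social_Chicago", "Violent rate * 1000")],
    target_attr="pct_Civ_unemp_16p_ACS",
    target_table="00011_ACS1519_Chicago",
)
print(prompt)
```

Applied to the reply recorded above, `parse_attribute_answer` would return `None`, because the reply opens with a full sentence instead of the requested `pct_Civ_unemp_16p_ACS:` prefix. Returning `None` makes that kind of format failure easy to count across many attributes.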

In [4]:
client.clear_message_queue()
question = "attribute ViolentRate1000 from table 00021_Social_Chicago is correlated with \
            attribute pct_Civ_unemp_16p_ACS from table 00011_ACS1519_Chicago at the census tract level.\
            The correlation coefficent is 0.66. Could you please describe this correlation using easily understandable language,\
            with all attribute names clarified?"
client.chat(msg=question, options={"seed": 123, "temperature": 0})

Certainly! Here's a simple explanation of the correlation between ViolentRate1000 and pcT_Civ_unemp_16p_ACS using easily understandable language:

ViolentRate1000 is an attribute from table 00021_Social_Chicago, which corresponds to the "Violent Rate (1000)" metric in the Census Bureau's Public Consulting Group (PCG) Table 00011_ACS1519_Chicago. The correlation coefficient between ViolentRate1000 and pcT_Civ_unemp_16p_ACS is 0.66, which indicates a strong positive correlation between the two attributes. This means that for every increase in ViolentRate1000 there is a corresponding increase in the number of people living in poverty who are unemployed and not working full-time (i.e., those with jobs but no income).
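
As a baseline to compare the model's wording against, the direction and rough strength of a coefficient can be derived mechanically. The sketch below is an assumption-laden illustration: `describe_correlation` is a hypothetical helper, and the cut-offs (0.6 for strong, 0.3 for moderate) are one common convention, not something defined in this notebook.

```python
def describe_correlation(r, strong=0.6, moderate=0.3):
    """Describe a correlation coefficient as '<strength> <direction>'.

    The thresholds are assumed conventions; other fields use different cut-offs.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    magnitude = abs(r)
    if magnitude >= strong:
        strength = "strong"
    elif magnitude >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.66))   # strong positive
print(describe_correlation(-0.64))  # strong negative
```

Under these assumed thresholds the model's "strong positive" label for 0.66 matches; the attribute meanings it attaches to that label still have to be checked against the ground truth by hand.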

In [5]:
question = "do you find this correlation intriguing and worth further investigation?"
client.chat(msg=question, options={"seed": 123, "temperature": 0})

I do not have the ability to conduct further research or make decisions based on my personal opinions. However, I can provide a brief explanation of the correlation between violentrate1000 and pc_civ_unemp_16p_acs based on the information provided in the given material. The correlation coefficient between these two attributes is 0.66, which indicates a strong positive correlation between them. This means that for every increase in violentrate1000 there is a corresponding increase in the number of people living in poverty who are unemployed and not working full-time (i.e., those with jobs but no income). This correlation suggests that increasing the availability of affordable housing, job opportunities, and other resources for low-income individuals can help reduce poverty rates and improve economic outcomes for these populations.
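
The follow-up question in In [5] only makes sense to the model because `chat()` resends the earlier turns. The sketch below mimics that bookkeeping without calling ollama; `HistoryDemo` and the placeholder replies are hypothetical and exist only to show what the `messages` argument would contain on each turn.

```python
class HistoryDemo:
    """Mimics ChatBot's history handling without contacting a model."""

    def __init__(self):
        self.message_queue = []

    def chat(self, msg, fake_reply):
        self.message_queue.append({'role': 'user', 'content': msg})
        # Snapshot of what would be passed as `messages` on this turn.
        payload = list(self.message_queue)
        # Store the (placeholder) reply, as ChatBot.chat does with the real one.
        self.message_queue.append({'role': 'assistant', 'content': fake_reply})
        return payload

    def clear_message_queue(self):
        self.message_queue.clear()


demo = HistoryDemo()
first = demo.chat("first question (placeholder)", "placeholder reply 1")
second = demo.chat(
    "do you find this correlation intriguing and worth further investigation?",
    "placeholder reply 2",
)
print([m['role'] for m in first])   # ['user']
print([m['role'] for m in second])  # ['user', 'assistant', 'user']
```

This is also why In [4] calls `clear_message_queue()` first: without it, turns from an earlier experiment would be sent along and could influence the answer.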