# Astrology Verification via Language Model

This Jupyter notebook introduces a project that tests the validity of astrology. A language model analyzes the biographies of people born on specific dates, one set of dates per astrological sign, and summarizes their characteristics. We then check whether the correct astrological sign can be assigned from those summaries alone.

The individuals are selected based on their birth date and their fame, ensuring a rich biography for the language model to process. The summaries generated by the language model serve as a character analysis based on the individuals' biographies. The language model is then tasked to assign an astrological sign to each person based on these summaries.

The underlying assumptions are that the character analysis based on the biography is accurate and that the correct astrological sign can be determined from these characteristics. This approach offers an intriguing way to examine the claims of astrology through the lens of data analysis and natural language processing.

The biographies will be processed by a local large language model served through Ollama. The model will assign astrological signs, on the assumption that it has absorbed enough modern astrological material to draw the same conclusions about people that astrologers typically do.

First, we will create a data file by randomly selecting a number of renowned individuals born at the midpoint of an astrological sign, where the characteristics of that sign are supposed to be at their strongest. We will retrieve the names of these individuals, along with the Wikipedia links to their biographies.
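The midpoint dates can be computed rather than looked up. The sketch below assumes the commonly cited tropical start dates for each sign (the exact boundaries shift by a day between years and sources) and a non-leap reference year. `SIGN_STARTS` and `sign_midpoints` are illustrative names, not part of the notebook.

```python
from datetime import date, timedelta

# Assumed, commonly cited start dates (month, day); real boundaries vary slightly by year.
SIGN_STARTS = [
    ("Aries", 3, 21), ("Taurus", 4, 20), ("Gemini", 5, 21), ("Cancer", 6, 21),
    ("Leo", 7, 23), ("Virgo", 8, 23), ("Libra", 9, 23), ("Scorpio", 10, 23),
    ("Sagittarius", 11, 22), ("Capricorn", 12, 22), ("Aquarius", 1, 20), ("Pisces", 2, 19),
]


def sign_midpoints(year=2001):
    """Return {sign: (month, day)} for the approximate midpoint of each sign."""
    midpoints = {}
    for i, (sign, month, day) in enumerate(SIGN_STARTS):
        start = date(year, month, day)
        # The sign ends where the next one in the list begins.
        _, next_month, next_day = SIGN_STARTS[(i + 1) % len(SIGN_STARTS)]
        next_start = date(year, next_month, next_day)
        if next_start <= start:
            # Capricorn runs across the turn of the year.
            next_start = date(year + 1, next_month, next_day)
        middle = start + timedelta(days=(next_start - start).days // 2)
        midpoints[sign] = (middle.month, middle.day)
    return midpoints


for sign, (month, day) in sign_midpoints().items():
    print(f"{sign}: {month:02d}-{day:02d}")
```

People born on or near these dates would then be the candidates for the data file.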

Next, we will automatically scrape all the biographies into a CSV or SQL file. This will result in a table containing birthdates, names, and biographies.

Afterward, the model served by Ollama will extract characteristics from this data. We will check that the few-shot prompt works correctly and that it produces the required results.

In this section, we'll loop through the biographies and use them as context for deriving personal information. If the Wikipedia page has a 'Personal Life' section, we'll take only that section. Otherwise, we'll use the whole biography. The derived personal information will be passed to the Ollama model, resulting in a list of short characteristics for each individual.
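The section-selection rule can be sketched as follows. It assumes each biography has already been split into a dictionary mapping section headings to text; that format and the name `select_context` are assumptions for illustration, not something the notebook defines.

```python
def select_context(sections):
    """Return the 'Personal life' text if present, else the whole biography.

    `sections` is assumed to map section headings to their plain text.
    """
    for title, text in sections.items():
        if title.strip().lower() == "personal life" and text.strip():
            return text.strip()
    # No usable 'Personal life' section: fall back to every section, in order.
    return "\n\n".join(text.strip() for text in sections.values() if text.strip())


biography = {
    "Early life": "Placeholder text about childhood.",
    "Personal life": "Placeholder text about family and hobbies.",
}
print(select_context(biography))
```

Matching the heading case-insensitively covers both 'Personal Life' and 'Personal life'.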

We will use a method called few-shot prompting to generate the data that we want for our analysis.

So let's use it now to see whether the model can produce an astrological analysis, based on whatever knowledge of astrology it has. To test it, we first read an astrological analysis of two or three people: their short biography and the astrological sign under which they were born. Then we give their biographies as input, let the model behave like an astrologer, and see whether it reaches the same conclusion. We also ask the model to give a reason for its choice.
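A few-shot prompt of this kind might be assembled as below. This is a minimal sketch: `build_few_shot_prompt` is a hypothetical helper, and the example entries are placeholders that would have to be replaced with real astrological analyses.

```python
def build_few_shot_prompt(examples, new_traits):
    """Build one prompt string: instruction, worked examples, then the open case."""
    parts = [
        "You are an astrologer. For the last person, answer with one zodiac sign "
        "and a short reason, following the examples."
    ]
    for example in examples:
        parts.append(
            'traits: "{traits}"\nsign: {sign}\nreason: {reason}'.format(**example)
        )
    # The final block is left open so the model completes the sign and reason.
    parts.append(f'traits: "{new_traits}"\nsign:')
    return "\n\n".join(parts)


# Placeholder examples only; these are not real analyses.
examples = [
    {"traits": "PLACEHOLDER summary A", "sign": "Aries", "reason": "PLACEHOLDER reason A"},
    {"traits": "PLACEHOLDER summary B", "sign": "Virgo", "reason": "PLACEHOLDER reason B"},
]
print(build_few_shot_prompt(examples, "calm and methodical"))
```

The resulting string could be passed to the model in place of a zero-shot question.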

We have to erase the names of the persons, which needs a method of its own. It is possible to cut the name out manually, but the text is then still so descriptive that the language model will know who it is about, including the date of birth.

So the characteristics first go through a prompt like this: "Describe these traits as if they were from a random person, make no reference to Barack Obama".

Even then, the text still refers to the person's activities, so it is easy to tell who it is about. That is why we pass it through a second prompt: "Just summarize the characteristics of the person, without mentioning in any way examples of his/her behavior".
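Before the text reaches those prompts, a deterministic pass can strip the explicit name so the result does not depend on the model following instructions. The sketch below is one possible approach using a regular expression; `redact_name` and its placeholder text are illustrative choices. It removes only the name itself, so identifying facts still have to be handled by the prompts.

```python
import re


def redact_name(text, full_name, placeholder="this person"):
    """Replace the full name and its individual parts with a neutral placeholder."""
    # Longest alternatives first, so "Barack Obama" wins over "Obama" alone.
    parts = {full_name} | {p for p in full_name.split() if len(p) > 2}
    pattern = "|".join(re.escape(p) for p in sorted(parts, key=len, reverse=True))
    redacted = re.sub(rf"\b(?:{pattern})\b", placeholder, text, flags=re.IGNORECASE)
    # Collapse any doubled spaces left behind.
    return re.sub(r" {2,}", " ", redacted)


print(redact_name("Barack Obama was a senator. Obama likes basketball.", "Barack Obama"))
```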

# Using Replicate

In [4]:
import replicate

import os

from getpass import getpass

# Ask for the Replicate API token instead of hardcoding it in the notebook
if 'REPLICATE_API_TOKEN' not in os.environ:
    os.environ['REPLICATE_API_TOKEN'] = getpass('Replicate API token: ')

# Verify that the environment variable is set without printing the secret
print('REPLICATE_API_TOKEN' in os.environ)


True


In [19]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.llms import Ollama


#TODO: make it possible to choose between models


def local_mistral(question):
    """Send a single question to the local Mistral model served by Ollama."""
    llm = Ollama(model="mistral")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a proficient and knowledgeable person"),
        ("user", "{input}")
    ])
    output_parser = StrOutputParser()
    # Chain the prompt, the model, and the parser that returns plain text
    chain = prompt | llm | output_parser

    return chain.invoke({"input": question})

def characteristics_of(person):
    prompt = f""" Describe the positive and negative character traits of {person}. Keep it very short
            """
    
    answer = local_mistral(prompt)
    print('part1', answer)
    
    return answer

def characteristics_of_2(person):
    prompt = f""" What are the motivations, values, and 
            behaviors of {person}. What are his healthy and unhealthy sides?
            """
    
    answer = local_mistral(prompt)
    print('part1', answer)
    
    return answer



def unpersonal_characteristics(characteristics, person):

    prompt = f"""

    positive and negative traits: "{characteristics}"

    Based on these positive and negative traits, make
    a general overview of characteristics while making no reference to {person}. Just summarize the characteristics 
    of the person. So don't mention in any way examples of his/her 
    behavior. Keep it short. """
    
    answer = local_mistral(prompt)
    print('part2', answer)


    return answer

def assign_zodiac_to(traits):
    prompt = f""" traits: "{traits}"
            question: "What could be the astrology sign of this person based on these traits?"
            answer: [just answer with one word, for example: "Pisces", "Virgo", not two!]

            """
    answer = local_mistral(prompt)
    print(answer)

    return answer

def predicted_astro_sign(person):
    
    # generate characteristics of the person
    characteristics = characteristics_of_2(person)
    
    # transform the characteristics to unpersonal traits
    unpersonal_traits = unpersonal_characteristics(characteristics, person)
    
    # draw a zodiac sign based on the traits
    astro_sign = assign_zodiac_to(unpersonal_traits)
    
    #TODO: write something to redo the last method in case 
    # it does not output a single word
    
    return astro_sign


person = 'Donald Trump'
predicted_astro_sign(person)
    

part1  Donald Trump, as a public figure, can be analyzed based on his actions, speeches, policies, and personal behaviors during his tenure as President of the United States (2017-2021) and before that in his business career. It is important to remember that interpretations of personality and motivations can vary greatly, especially when dealing with high-profile individuals like Trump.

**Motivations:**
1. **Power and Wealth**: Trump has consistently expressed a desire for power and wealth throughout his career. This motivation can be seen in his business ventures, political campaigns, and time in the White House. He often emphasized making "America Great Again" as a means to increase U.S. power and influence on the global stage.
2. **Recognition and Admiration**: Trump seems to be driven by a need for recognition and admiration. He often seeks validation through media attention, social media engagement, and poll numbers. This motivation can be both healthy (a drive to succeed) and un

' Aries'
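The returned value above carries a leading space, and the TODO in `predicted_astro_sign` notes that the model may not always answer with a single word. One way to handle both is sketched below; `extract_sign` and `assign_with_retry` are hypothetical helpers, and `ask` stands for any function that takes the traits and returns the raw model answer (for example `assign_zodiac_to`).

```python
import re

ZODIAC_SIGNS = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]


def extract_sign(answer):
    """Return the one sign named in the answer, or None if zero or several are named."""
    found = {
        sign for sign in ZODIAC_SIGNS
        if re.search(rf"\b{sign}\b", answer, flags=re.IGNORECASE)
    }
    return found.pop() if len(found) == 1 else None


def assign_with_retry(ask, traits, max_attempts=3):
    """Call `ask(traits)` until it yields exactly one sign, up to `max_attempts` times."""
    for _ in range(max_attempts):
        sign = extract_sign(ask(traits))
        if sign is not None:
            return sign
    return None  # the model never gave an unambiguous answer


print(extract_sign(" Aries"))
```

Returning `None` after the last attempt keeps ambiguous answers out of the results instead of guessing.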

In [15]:
print(local_mistral("What you think of astrology?"))

 As a person who values evidence-based knowledge, I believe that while astrology has a long history and cultural significance for many people around the world, it lacks empirical scientific validation. Astrology is not supported by peer-reviewed research or consistent with our current understanding of astronomy and physics. However, I respect that people find comfort and personal meaning in astrology, and it's essential to preserve its cultural heritage while promoting critical thinking and evidence-based decision-making.


# Scraping Astro websites



In [3]:
import pandas as pd
from bs4 import BeautifulSoup
import requests

astrological_signs = [
    "Aries",
    "Taurus",
    "Gemini",
    "Cancer",
    "Leo",
    "Virgo",
    "Libra",
    "Scorpio",
    "Sagittarius",
    "Capricorn",
    "Aquarius",
    "Pisces"
]


# List to store data
data = []

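for sign in astrological_signs:
    # Fetch the page listing the top celebrities for this sign
    response = requests.get(f"https://astro-charts.com/persons/top/{sign.lower()}/", timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    for match in soup.find_all('div', class_="celeb-info"):
        # The name is read from the second <p> inside each block
        name = match.find_all('p')[1].text
        data.append({"Name": name, "Sign": sign})

# Create DataFrame
df = pd.DataFrame(data)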

# Display DataFrame
print(df.head())

# Save DataFrame to a CSV file
# df.to_csv('astrological_signs.csv', index=False)

            Name   Sign
0  Matthew Healy  Aries
1      Lady Gaga  Aries
2  Conan O'Brien  Aries
3          Quavo  Aries
4   Mariah Carey  Aries


In [82]:
# Initialize the new column with default values
df['Predicted'] = None

# Predict a sign for the first five people only
for i in range(5):
    name = df.loc[i, 'Name']
    # Strip the leading whitespace the model tends to emit (e.g. ' Aries')
    df.loc[i, 'Predicted'] = predicted_astro_sign(name).strip()

# Display the updated DataFrame
print(df)

                   Name    Sign Predicted
0         Matthew Healy   Aries     Libra
1             Lady Gaga   Aries   Scorpio
2         Conan O'Brien   Aries    Gemini
3                 Quavo   Aries       Leo
4          Mariah Carey   Aries       Leo
...                 ...     ...       ...
1195      Ewan McGregor  Pisces      None
1196      Ty Dolla Sign  Pisces      None
1197        Gary Oldman  Pisces      None
1198       Julia Stiles  Pisces      None
1199  Matthew Broderick  Pisces      None

[1200 rows x 3 columns]
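To judge whether the predictions beat guessing, the hit count can be compared with what chance alone would produce. The sketch below uses only the five predicted rows shown above and assumes each sign is equally likely under random guessing (probability 1/12), which is a simplification because a model may favor some signs. `accuracy` and `p_at_least` are illustrative names.

```python
from math import comb


def accuracy(pairs):
    """Count hits among (actual, predicted) pairs, skipping rows without a prediction."""
    scored = [(actual, predicted) for actual, predicted in pairs if predicted is not None]
    hits = sum(actual == predicted.strip() for actual, predicted in scored)
    return hits, len(scored)


def p_at_least(hits, n, p=1 / 12):
    """Probability of at least `hits` successes in `n` independent guesses."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(hits, n + 1))


# The five (Sign, Predicted) rows from the table above.
rows = [
    ("Aries", "Libra"), ("Aries", "Scorpio"), ("Aries", "Gemini"),
    ("Aries", "Leo"), ("Aries", "Leo"),
]
hits, n = accuracy(rows)
print(f"{hits} of {n} correct; chance of doing at least this well by guessing: {p_at_least(hits, n):.3f}")
```

Five rows are far too few to conclude anything; the same functions apply unchanged once more rows are predicted.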


# Plotly plots

In [10]:
from dash import Dash, html, dcc, callback, Output, Input
import plotly.express as px
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/gapminder_unfiltered.csv')

app = Dash()

# Wrap the components in a single html.Div: older Dash versions do not accept
# a plain list as the layout and raise NoLayoutException
app.layout = html.Div([
    html.H1(children='Title of Dash App', style={'textAlign':'center'}),
    dcc.Dropdown(df.country.unique(), 'Canada', id='dropdown-selection'),
    dcc.Graph(id='graph-content')
])

@callback(
    Output('graph-content', 'figure'),
    Input('dropdown-selection', 'value')
)
def update_graph(value):
    # Filter the data to the selected country and plot its population over time
    dff = df[df.country==value]
    return px.line(dff, x='year', y='pop')

if __name__ == '__main__':
    app.run(debug=True)