# CryptoGPT: Crypto Twitter Sentiment Analysis

## Project Setup

We'll use Python 3.11.3 for this project, and the directory structure will be as follows:

In [None]:
.
├── .flake8
├── .gitignore
├── .python-version
├── .vscode
│   └── settings.json
├── main.py
├── requirements.txt
└── sentiment_analyzer.py

### Libraries

In [None]:
!pip install -U pip
!pip install black isort langchain openai pandas plotly
!pip install https://github.com/mahrtayyab/tweety/archive/main.zip --upgrade

### Config

We'll use black and isort for formatting and import sorting, and configure VS Code to run both on save:

In [None]:
// .vscode/settings.json
{
  "python.formatting.provider": "black",
  "[python]": {
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.organizeImports": true
    }
  },
  "isort.args": ["--profile", "black"]
}

In [None]:
# .flake8
[flake8]
max-line-length = 120

### Streamlit

Streamlit is an open-source Python library for building interactive, data-driven web applications directly from Python scripts. It lets us turn our analysis code into a shareable web app with very little extra code, so we'll use it to build the interface for analyzing the sentiment of cryptocurrency tweets.

## Get Tweets

To fetch tweets for our analysis, we'll use the tweety library (installed from GitHub in the Libraries step; pip reports the package as `tweety-ns`). This library interacts with Twitter's frontend API to retrieve the desired tweets.

The cell below opens a session, fetches tweets for a single account, removes links from the text, and loads the result into a pandas DataFrame:

In [9]:


from tweety import Twitter

twitter_client = Twitter("session")
tweets = twitter_client.get_tweets("elonmusk")
# for tweet in tweets:
#     print(tweet.text)
#     print()

import re


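def clean_tweet(text: str) -> str:
    """Remove links and collapse runs of whitespace."""
    text = re.sub(r"http\S+", "", text)
    # Escape the dot so the pattern matches a literal "www." prefix only.
    text = re.sub(r"www\.\S+", "", text)
    # Strip so that a link-only tweet becomes an empty string.
    return re.sub(r"\s+", " ", text).strip()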

from typing import List

import pandas as pd
from tweety.types import Tweet


def create_dataframe_from_tweets(tweets: List[Tweet]) -> pd.DataFrame:
    """Build a DataFrame of cleaned tweets, indexed by tweet id, newest first."""
    rows = []
    for tweet in tweets:
        clean_text = clean_tweet(tweet.text)
        # Skip tweets that are empty after cleaning.
        if len(clean_text) == 0:
            continue
        rows.append(
            {
                "id": tweet.id,
                "text": clean_text,
                "author": tweet.author.username,
                "date": str(tweet.date.date()),
                "created_at": tweet.date,
                "views": tweet.views,
            }
        )

    df = pd.DataFrame(
        rows,
        columns=["id", "text", "author", "date", "views", "created_at"]
    )
    df.set_index("id", inplace=True)
    if df.empty:
        return df
    # Optional 7-day filter, left disabled (needs `from datetime import datetime`):
    # today = datetime.now().date()
    # df = df[
    #     df.created_at.dt.date > today - pd.to_timedelta("7day")
    # ]
    return df.sort_values(by="created_at", ascending=False)

df = create_dataframe_from_tweets(tweets)
df.head(10)

| id | text | author | date | views | created_at |
| --- | --- | --- | --- | --- | --- |
| 1791689030641578286 | Ice cream is an amazing invention | elonmusk | 2024-05-18 | 86507244 | 2024-05-18 04:35:27+00:00 |
| 1791636698318745634 | Falcon going to orbit as seen from ocean | elonmusk | 2024-05-18 | 84232753 | 2024-05-18 01:07:30+00:00 |
| 1778833124643983615 | To an exciting & inspiring future! | elonmusk | 2024-04-12 | 104631132 | 2024-04-12 17:10:40+00:00 |
| 1765047740327702665 | If you’re reading this post, it’s because our ... | elonmusk | 2024-03-05 | 105358570 | 2024-03-05 16:12:28+00:00 |
| 1748036927821942798 | So hot rn | elonmusk | 2024-01-18 | 121492475 | 2024-01-18 17:37:35+00:00 |
| 1718192456838107162 | Oh the Irany … | elonmusk | 2023-10-28 | 110247656 | 2023-10-28 09:06:18+00:00 |
| 1707915765977055584 | Hip-firing my Barrett 50 cal | elonmusk | 2023-09-30 | 94050848 | 2023-09-30 00:30:24+00:00 |
| 1688022163574439937 | If you were unfairly treated by your employer ... | elonmusk | 2023-08-06 | 142615227 | 2023-08-06 03:00:20+00:00 |
| 1686058966705487875 | Wow, I’m glad so many people love Canada too 🤗 | elonmusk | 2023-07-31 | 68744363 | 2023-07-31 16:59:17+00:00 |
| 1686050455468621831 | I ♥️ Canada | elonmusk | 2023-07-31 | 142566839 | 2023-07-31 16:25:28+00:00 |
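
A quick way to sanity-check the cleaning step is to run it on a made-up string. The sketch below restates the function so that it runs on its own; this variant escapes the dot in the `www` pattern and trims the result, and the sample text is hypothetical rather than a real tweet.

```python
import re


def clean_tweet(text: str) -> str:
    """Remove links and collapse whitespace (standalone variant for this demo)."""
    text = re.sub(r"http\S+", "", text)   # drop http:// and https:// links
    text = re.sub(r"www\.\S+", "", text)  # drop bare www. links
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical tweet text, made up for illustration.
sample = "BTC breaks out!\n\nChart: https://example.com/chart  www.example.com"
cleaned = clean_tweet(sample)
print(cleaned)

# A tweet that is only a link should clean down to an empty string,
# which is what lets the DataFrame builder skip it.
link_only = clean_tweet("https://example.com/only-a-link")
print(repr(link_only))
```

Trimming matters here because a link-only tweet would otherwise leave a single space behind and pass a length check.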

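The DataFrame builder leaves its seven-day filter commented out. One way to get a similar effect, sketched here on plain dictionaries before they reach pandas, is to compare each `created_at` value against a cutoff. `keep_recent` and the sample rows are hypothetical, and the sketch assumes `created_at` is timezone-aware, as the `+00:00` offsets in the output suggest.

```python
from datetime import datetime, timedelta, timezone


def keep_recent(rows: list[dict], now: datetime, days: int = 7) -> list[dict]:
    """Keep rows whose timezone-aware "created_at" falls within `days` of `now`."""
    cutoff = now - timedelta(days=days)
    return [row for row in rows if row["created_at"] >= cutoff]


# Hypothetical rows shaped like the dictionaries collected for the DataFrame.
now = datetime(2024, 5, 20, tzinfo=timezone.utc)
rows = [
    {"id": 1, "created_at": datetime(2024, 5, 18, 4, 35, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2024, 4, 12, 17, 10, tzinfo=timezone.utc)},
]
recent = keep_recent(rows, now)
print([row["id"] for row in recent])
```

Passing `now` in explicitly keeps the function deterministic, which makes it easier to test than a version that calls `datetime.now()` internally.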
