# Crypto Gainers Analysis

This project scrapes article titles about the top cryptocurrency gainers from insidebitcoins.com, extracts the date and the cryptocurrency names from each title, and summarizes how often each cryptocurrency appears in each month.

## Setup

First, we need to install the necessary packages. Run the following cell to install the required libraries.



In [None]:
# Install required packages (tabulate is used by the Insights cell)
!pip install requests
!pip install beautifulsoup4
!pip install pandas
!pip install matplotlib
!pip install tabulate




## Web Scraping

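The following code requests each of the 22 search-result pages for "Top Crypto Gainers" on insidebitcoins.com, extracts the article titles, and saves them with their page number to a CSV file.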


In [None]:
import requests
from bs4 import BeautifulSoup
import csv

# Base URL and search query
base_url = "https://insidebitcoins.com/page/"
search_query = "?s=Top+Crypto+Gainers"

# Total number of pages
total_pages = 22

# Open a CSV file to store the data
with open("top_crypto_gainers.csv", "w", newline='', encoding="utf-8") as csvfile:
    # Create a CSV writer object
    writer = csv.writer(csvfile)

    # Write header row
    writer.writerow(['Page', 'Title'])

    # Loop through each page
    for page_num in range(1, total_pages + 1):
        # Construct the URL for the current page
        url = f"{base_url}{page_num}{search_query}"

        # Send a GET request to the URL; stop on a timeout or an HTTP error status
        response = requests.get(url, timeout=30)
        response.raise_for_status()

        # Parse HTML
        soup = BeautifulSoup(response.text, 'html.parser')

        # Find all the article titles containing the top crypto gainers
        article_titles = soup.find_all('a', class_='article-header-title')

        # Write the titles to the CSV file
        for title in article_titles:
            writer.writerow([page_num, title.text.strip()])

print("Data has been saved to top_crypto_gainers.csv.")


Data has been saved to top_crypto_gainers.csv.
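
The loop above builds one URL per results page by concatenating the base URL, the page number, and the search query. The following is a minimal sketch of that construction, with a pause between requests added as a courtesy to the server. The helper names `build_page_urls` and `fetch_pages`, the injectable `fetch` callable, and the one-second delay are illustrative assumptions and are not part of the original notebook.

```python
import time
from typing import Callable, Dict, List

BASE_URL = "https://insidebitcoins.com/page/"
SEARCH_QUERY = "?s=Top+Crypto+Gainers"


def build_page_urls(total_pages: int) -> List[str]:
    """Return the search-result URL for every page from 1 to total_pages."""
    return [f"{BASE_URL}{page}{SEARCH_QUERY}" for page in range(1, total_pages + 1)]


def fetch_pages(urls: List[str], fetch: Callable[[str], str], delay: float = 1.0) -> Dict[str, str]:
    """Call fetch(url) for each URL, sleeping `delay` seconds between calls.

    `fetch` is a hypothetical callable that returns the page HTML, for example
    lambda url: requests.get(url, timeout=30).text
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # pause between requests, not before the first one
        pages[url] = fetch(url)
    return pages


urls = build_page_urls(22)
print(urls[0])    # first results page
print(len(urls))  # number of pages to request
```

Passing the fetch function in as an argument keeps the sketch runnable without network access; in the notebook it would wrap `requests.get`.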


## Data Processing

Next, we process the scraped titles to extract the month, the day, and the cryptocurrency names, and save one row per cryptocurrency to a new CSV file. The counting by month happens in the next section.


In [None]:
import csv
import pandas as pd

# Load the original CSV file
df = pd.read_csv('top_crypto_gainers.csv')

# Initialize a dictionary to store data for each page
page_data = {}

# Iterate over each row in the CSV
for index, row in df.iterrows():
    page = int(row['Page'])
    title = row['Title']

    # Extract month, day, and cryptocurrencies.
    # The part after "Today" is expected to read "<Month> <Day> <en dash> <coin>, <coin>, ..."
    split_title = title.split('Today')
    if len(split_title) > 1 and '\u2013' in split_title[1]:  # Needs "Today" and an en dash
        month_day = split_title[1].strip().split()  # Tokens following "Today"
        month = month_day[0]  # Extract month
        day = month_day[1]  # Extract day
        # Coins follow the en dash (U+2013), separated by commas
        cryptos = [crypto.strip() for crypto in split_title[1].split('\u2013')[1].split(',')]

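        # Update page data dictionary. Month and Day come from the first matching
        # title on a page; coins from later titles on that page share that date.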
        if page not in page_data:
            page_data[page] = {'Month': month, 'Day': day, 'Cryptocurrencies': cryptos}
        else:
            page_data[page]['Cryptocurrencies'].extend(cryptos)

# Write the extracted data into a new CSV file (UTF-8, matching the scraped file)
with open('top_crypto_gainers_processed.csv', 'w', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['Page', 'Month', 'Day', 'Cryptocurrency']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()

    # Write data for each page
    for page, data in page_data.items():
        month = data['Month']
        day = data['Day']
        cryptos = data['Cryptocurrencies']
        for crypto in cryptos:
            writer.writerow({'Page': page, 'Month': month, 'Day': day, 'Cryptocurrency': crypto})

print("Data saved successfully to 'top_crypto_gainers_processed.csv'.")
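
The cell above pulls the month, the day, and the coin list out of each title with chained `split` calls. As an alternative, the same extraction can be sketched with a regular expression, which returns `None` for titles that do not match instead of raising an error. This is a swapped-in technique, not the notebook's method; the pattern, the function name `parse_title`, and the sample title are assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# Assumed title shape: "<prefix> Today <Month> <Day> <en dash> <coin>, <coin>, ..."
TITLE_PATTERN = re.compile(
    r"Today\s+(?P<month>[A-Za-z]+)\s+(?P<day>\d{1,2})\s*\u2013\s*(?P<coins>.+)"
)


def parse_title(title: str) -> Optional[Tuple[str, int, List[str]]]:
    """Return (month, day, coins) for a matching title, or None otherwise."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    coins = [coin.strip() for coin in match.group("coins").split(",") if coin.strip()]
    return match.group("month"), int(match.group("day")), coins


# Made-up title used only to exercise the parser
sample = "Top Crypto Gainers Today Mar 20 \u2013 Toncoin, Stacks, Bonk"
print(parse_title(sample))                # ('Mar', 20, ['Toncoin', 'Stacks', 'Bonk'])
print(parse_title("Unrelated headline"))  # None
```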


## Summarizing Data

We will now summarize the data to count the occurrences of each cryptocurrency across different months.


In [None]:
# Load the processed CSV file into a DataFrame
df_processed = pd.read_csv("top_crypto_gainers_processed.csv")

# Create a pivot table to count the occurrences of each cryptocurrency in each month
pivot_table = df_processed.pivot_table(index='Cryptocurrency', columns='Month', aggfunc='size', fill_value=0)

# Calculate the total count of occurrences for each cryptocurrency across all months
total_count = pivot_table.sum(axis=1)

# Get the list of months
months = pivot_table.columns.tolist()

# Initialize a dictionary to store the months in which each cryptocurrency appeared
months_dict = {}

# Iterate over each cryptocurrency
for crypto in pivot_table.index:
    # Get the months in which the cryptocurrency appeared
    appeared_months = [month for month in months if pivot_table.loc[crypto, month] > 0]
    # Store the months in the dictionary
    months_dict[crypto] = appeared_months

# Create a DataFrame to display the total count of occurrences and the months
result_df = pd.DataFrame({'Total Count': total_count, 'Appeared Months': months_dict})

# Sort the DataFrame by the "Total Count" column in descending order
result_df_sorted = result_df.sort_values(by='Total Count', ascending=False)

# Save the result to a new CSV file
result_df_sorted[['Total Count', 'Appeared Months']].to_csv("Cryptocurrency_Gainers_Summary_Generator.csv", index_label='Cryptocurrency')

print("Data has been saved to 'Cryptocurrency_Gainers_Summary_Generator.csv'.")


Data has been saved to 'Cryptocurrency_Gainers_Summary_Generator.csv'.
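
The summary above reduces the processed rows to two values per cryptocurrency: a total count and the list of months in which it appeared. A minimal sketch of the same idea using only the standard library is shown here on a few made-up rows; the data and variable names are illustrative, and months are listed in order of first appearance rather than in the alphabetical column order of the pivot table.

```python
from collections import Counter, defaultdict

# Made-up (month, cryptocurrency) rows standing in for the processed CSV
rows = [
    ("Jan", "Stacks"),
    ("Jan", "Bonk"),
    ("Feb", "Stacks"),
    ("Feb", "Stacks"),
    ("Mar", "Bonk"),
]

# Total number of appearances per cryptocurrency
total_count = Counter(coin for _, coin in rows)

# Distinct months per cryptocurrency, in order of first appearance
appeared_months = defaultdict(list)
for month, coin in rows:
    if month not in appeared_months[coin]:
        appeared_months[coin].append(month)

# Sort by total count, highest first, as the notebook does with sort_values
summary = [(coin, count, appeared_months[coin]) for coin, count in total_count.most_common()]
for coin, count, months_seen in summary:
    print(coin, count, months_seen)
```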


## Insights

This cell derives three summaries from the processed data: the number of entries per month, the most frequent gainer in each month, and the five most frequent gainers overall. Each summary is printed as a table and exported to its own CSV file.


In [None]:
from tabulate import tabulate
import pandas as pd

# Load the processed CSV file into a DataFrame
top_crypto_gainers_processed = pd.read_csv("top_crypto_gainers_processed.csv")

# Insight 1: Monthly Distribution of Top Crypto Gainers
monthly_distribution = top_crypto_gainers_processed.groupby(['Month']).size().reset_index(name='Count')
insight_1_data = monthly_distribution.values.tolist()

# Insight 2: Top Gainers by Month
top_gainers_by_month = top_crypto_gainers_processed.groupby(['Month'])['Cryptocurrency'].agg(lambda x: x.value_counts().index[0]).reset_index(name='Top Gainer')
insight_2_data = top_gainers_by_month.values.tolist()

# Insight 3: Top Gainers Overall
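# rename_axis keeps the first column named 'Cryptocurrency' regardless of pandas version
top_gainers_overall = top_crypto_gainers_processed['Cryptocurrency'].value_counts().rename_axis('Cryptocurrency').reset_index(name='Count').head(5)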
insight_3_data = top_gainers_overall.values.tolist()

# Display insights as tables
print("Insight 1: Monthly Distribution of Top Crypto Gainers")
print(tabulate(insight_1_data, headers=["Month", "Count"]))
print()

print("Insight 2: Top Gainers by Month")
print(tabulate(insight_2_data, headers=["Month", "Top Gainer"]))
print()

print("Insight 3: Top Gainers Overall")
print(tabulate(insight_3_data, headers=["Cryptocurrency", "Count"]))
print()

# Export insights to CSV
insights_data = {
    "Monthly Distribution": monthly_distribution,
    "Top Gainers by Month": top_gainers_by_month,
    "Top Gainers Overall": top_gainers_overall
}

for insight_name, insight_df in insights_data.items():
    insight_df.to_csv(f"{insight_name}_Insight.csv", index=False)
    print(f"{insight_name} data has been saved to '{insight_name}_Insight.csv'")


Insight 1: Monthly Distribution of Top Crypto Gainers
Month      Count
-------  -------
Apr          119
Dec          120
Feb           79
Jan          120
Mar           79
March         39
May          118
Nov          108

Insight 2: Top Gainers by Month
Month    Top Gainer
-------  -------------
Apr      Akash Network
Dec      Bonk
Feb      Immutable
Jan      ORDI
Mar      Toncoin
March    Shiba Inu
May      Bitcoin Gold
Nov      Ark

Insight 3: Top Gainers Overall
Cryptocurrency      Count
----------------  -------
Stacks                 14
Akash Network          13
Axelar                 12
Optimism               12
Celestia               12

Monthly Distribution data has been saved to 'Monthly Distribution_Insight.csv'
Top Gainers by Month data has been saved to 'Top Gainers by Month_Insight.csv'
Top Gainers Overall data has been saved to 'Top Gainers Overall_Insight.csv'
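
The tables above list `Mar` and `March` as separate months because the month label is taken verbatim from each title. One way to merge such variants is to normalize every label to a three-letter abbreviation before grouping. The sketch below is an assumption about how that could be done; the function name `normalize_month` is hypothetical and the original notebook does not perform this step.

```python
MONTH_ABBREVIATIONS = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
}


def normalize_month(label: str) -> str:
    """Map variants such as 'March', 'mar' or 'Mar.' to 'Mar'.

    Labels that do not start with a known month abbreviation are returned unchanged.
    """
    key = label.strip().rstrip(".,")[:3].title()
    return key if key in MONTH_ABBREVIATIONS else label


for label in ["Mar", "March", "mar.", "Nov", "Today"]:
    print(label, "->", normalize_month(label))
```

With pandas, the function could be applied to the processed data before grouping, for example with `df['Month'] = df['Month'].map(normalize_month)`.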


## Visualization

Finally, we visualize the top 10 cryptocurrencies by the total count of occurrences.


In [None]:
import matplotlib.pyplot as plt

# Load the summary CSV file created in the Summarizing Data step
summary_df = pd.read_csv("Cryptocurrency_Gainers_Summary_Generator.csv")

# Get the top 10 cryptocurrencies by total count
top_10 = summary_df.nlargest(10, 'Total Count')

# Plot the top 10 cryptocurrencies
plt.figure(figsize=(12, 8))
plt.barh(top_10['Cryptocurrency'], top_10['Total Count'], color='skyblue')
plt.xlabel('Total Count')
plt.ylabel('Cryptocurrency')
plt.title('Top 10 Cryptocurrencies by Total Count of Appearances')
plt.gca().invert_yaxis()
plt.show()


## Conclusion

This project demonstrates web scraping, data processing, and visualization in Python. It collects article titles from insidebitcoins.com, extracts the date and the cryptocurrency names from each title, counts how often each cryptocurrency appears per month, and plots the ten most frequent. The resulting counts show which cryptocurrencies appeared most often among the site's reported top gainers between November and May.
