
Creating a simple project to retrieve, clean, manipulate, and analyze cryptocurrency data using Python and pandas can be a fun way to get hands-on experience with data manipulation and analysis. Here's a step-by-step guide on how to approach this project.


1. Set Up Your Environment
Make sure the required packages are installed. You'll need pandas, requests, and matplotlib; the later sentiment cell also imports vaderSentiment.

2. Data Retrieval
You can retrieve cryptocurrency data using public APIs like the CoinGecko API or the CoinMarketCap API. CoinGecko doesn’t require authentication, so it’s a simpler starting point.


In [None]:

%pip install pandas requests matplotlib vaderSentiment

Import the modules.

In [None]:
import requests
import pandas as pd

Define the function get_crypto_data, which queries "https://api.coingecko.com/api/v3/coins/{crypto_id}/market_chart" for daily prices in US dollars. The number of days is a parameter and defaults to 365.

In [None]:
def get_crypto_data(crypto_id, days='365'):
  url = f"https://api.coingecko.com/api/v3/coins/{crypto_id}/market_chart"
  params = {
      'vs_currency': 'usd',
      'days': days,
      'interval': 'daily'
      }

  # A timeout keeps the call from hanging indefinitely on a stalled connection
  response = requests.get(url, params=params, timeout=10)

  if response.status_code == 200:
    data = response.json()
    # Convert price data into a DataFrame
    prices = pd.DataFrame(data['prices'], columns=['timestamp', 'price'])
    # Convert timestamp to a readable date format
    prices['timestamp'] = pd.to_datetime(prices['timestamp'], unit='ms')
    return prices
  else:
    print(f"Failed to retrieve data: {response.status_code}")
    return None

# Example: Get 1 year of Cardano data
ada_data = get_crypto_data('cardano')
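
To make the conversion step easier to follow without calling the API, here is a minimal sketch that applies the same logic to a hand-written payload. The values in `sample_payload` are invented for illustration, and `prices_to_frame` is a hypothetical helper name; only the `prices` key and its `[timestamp in ms, price]` layout come from the function above.

```python
import pandas as pd

# Hypothetical payload shaped like the 'prices' part of the response used above:
# a list of [timestamp in milliseconds, price] pairs. The values are invented.
sample_payload = {
    "prices": [
        [1704067200000, 0.59],
        [1704153600000, 0.62],
        [1704240000000, 0.60],
    ]
}

def prices_to_frame(payload):
    """Convert the 'prices' list into a DataFrame with datetime timestamps."""
    frame = pd.DataFrame(payload["prices"], columns=["timestamp", "price"])
    # Millisecond epoch values become pandas datetimes
    frame["timestamp"] = pd.to_datetime(frame["timestamp"], unit="ms")
    return frame

sample_prices = prices_to_frame(sample_payload)
print(sample_prices)
```

Separating the parsing from the HTTP call like this also makes the conversion easy to check offline.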

3. Data Cleaning
Once you have the data, you might want to clean it up. Common cleaning tasks include removing duplicates, handling missing values, and formatting timestamps.
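
The cleaning tasks listed above can be sketched on a tiny, invented DataFrame, so the effect of each step is visible. The sample values are assumptions made up for this illustration, and forward-filling is only one possible way to handle missing prices.

```python
import pandas as pd

# Invented sample with one duplicated timestamp and one missing price
raw = pd.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"],
    "price": [0.59, 0.62, 0.62, None],
})

cleaned = raw.copy()
# Formatting timestamps: parse the strings into datetimes
cleaned["timestamp"] = pd.to_datetime(cleaned["timestamp"])
# Removing duplicates: keep the first row for each timestamp
cleaned = cleaned.drop_duplicates(subset=["timestamp"])
# Handling missing values: carry the last known price forward
cleaned["price"] = cleaned["price"].ffill()
# Renumber the rows after dropping some of them
cleaned = cleaned.reset_index(drop=True)

print(cleaned)
```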


In [None]:
print(ada_data)


In [None]:
# Formatting timestamps
ada_data['timestamp'] = pd.to_datetime(ada_data['timestamp'])

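# Removing duplicates (drop_duplicates returns a new DataFrame, so assign the result)
ada_data = ada_data.drop_duplicates(subset=['timestamp'])

# Handling missing values (optional: forward-fill gaps with the last known value)
#ada_data = ada_data.ffill()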

#Resetting the index after cleaning
ada_data.reset_index(drop=True, inplace=True)

print(ada_data)


Calculate the daily return


In [None]:
ada_data['daily_return'] = ada_data['price'].pct_change()
print(ada_data['daily_return'])
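
As a small worked example of what `pct_change` computes, the sketch below compares it with the formula (price today - price yesterday) / price yesterday written out using `shift`. The three prices are invented for illustration.

```python
import pandas as pd

# Invented prices: up 10%, then down 10%
prices = pd.Series([100.0, 110.0, 99.0])

# pct_change compares each value with the previous one
returns = prices.pct_change()

# The same calculation written out explicitly with shift()
manual = (prices - prices.shift(1)) / prices.shift(1)

print(returns)
print(manual)
```

The first entry is NaN in both cases because the first day has no previous price to compare against.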

Calculate the 7-day moving average

In [None]:
ada_data['7_day_MA'] = ada_data['price'].rolling(window=7).mean()
print(ada_data['7_day_MA'])
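
With `window=7`, the first six rows of the moving average are NaN because a full window is not yet available. The sketch below, on an invented series, shows that behaviour next to the `min_periods=1` variant, which averages whatever values exist so far. Which of the two is appropriate depends on the analysis.

```python
import pandas as pd

# Invented prices, one per day
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Default behaviour: NaN until seven values are available
ma_strict = prices.rolling(window=7).mean()

# Alternative: average over the values seen so far, up to seven
ma_partial = prices.rolling(window=7, min_periods=1).mean()

print(pd.DataFrame({"price": prices, "strict": ma_strict, "partial": ma_partial}))
```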

Find the highest price

In [None]:
highest_price = ada_data['price'].max()
print(highest_price)
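
`max()` returns only the value. If the date of the peak is also of interest, one option is `idxmax`, sketched here on an invented three-row DataFrame with the same column names as above.

```python
import pandas as pd

# Invented sample using the same column names as the notebook
frame = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "price": [0.59, 0.62, 0.60],
})

# idxmax returns the index label of the highest price; loc fetches that row
peak_row = frame.loc[frame["price"].idxmax()]
print(peak_row["timestamp"], peak_row["price"])
```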

6. Basic Analysis and Visualization
You can perform some basic analysis and visualize the data to understand trends. For example, plot the price and moving average over time.

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(ada_data['timestamp'], ada_data['price'], label='Price', color='blue')
plt.plot(ada_data['timestamp'], ada_data['7_day_MA'], label='7-Day MA', color='red')
plt.title('Cardano Price Over Time')
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.legend()
plt.grid(True)
plt.show()

Saving your data

In [None]:
ada_data.to_csv('cardano_cleaned_data.csv', index=False)
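
One detail worth knowing when saving to CSV: datetimes are written as text, so they come back as plain strings unless they are parsed again on load. The sketch below shows this round trip with an in-memory buffer and invented values, as an illustration rather than a required step.

```python
import io
import pandas as pd

# Invented sample using the same column names as the notebook
frame = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "price": [0.59, 0.62],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
frame.to_csv(buffer, index=False)

# Reading without parse_dates leaves the timestamps as strings
buffer.seek(0)
as_text = pd.read_csv(buffer)

# Reading with parse_dates restores the datetime type
buffer.seek(0)
as_dates = pd.read_csv(buffer, parse_dates=["timestamp"])

print(as_text["timestamp"].dtype, as_dates["timestamp"].dtype)
```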

Load the data:
Import the Cardano price dataset you obtained through the CoinGecko API.
Also download and load the news dataset from the link you provided. This dataset includes cryptocurrency-related articles and news, with timestamps indicating when each item was published.

In [None]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Load the Cardano price dataset

In [None]:
cardano_data = pd.read_csv('cardano_cleaned_data.csv')
cardano_data['timestamp'] = pd.to_datetime(cardano_data['timestamp'])
print(cardano_data)

I cloned the cryptoNewsDataset repo locally, and now I need to extract the .rar archive to work with the .csv files.

In [None]:
from google.colab import files
files.download('cardano_cleaned_data.csv')


In [None]:
import pandas as pd

# Load the price data
# Note: this cell expects 'date' and 'close' columns; the CSV saved earlier uses 'timestamp' and 'price'
price_df = pd.read_csv('path_to_your_cardano_price_data.csv')  # Replace with your CSV file
price_df['date'] = pd.to_datetime(price_df['date'])  # Make sure the date column is parsed as datetime

# Load the news dataset
news_df = pd.read_csv('path_to_your_cryptoNewsDataset/cryptopanic_news.csv')  # Replace with your CSV file
news_df['newsDatetime'] = pd.to_datetime(news_df['newsDatetime'])  # Make sure the date column is parsed as datetime

# Identify significant price changes (e.g. > 5%)
price_df['price_change'] = price_df['close'].pct_change()  # Compute the percentage change
significant_changes = price_df[price_df['price_change'].abs() > 0.05]  # Changes > 5%

# Merge the news data with the significant price changes
# Note: this joins on exact timestamps, so a daily price row only matches news published at that same instant
merged_df = pd.merge(significant_changes, news_df, left_on='date', right_on='newsDatetime', how='left')

# Inspect the price changes next to the matching news titles
print(merged_df[['date', 'close', 'price_change', 'title']])
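
Because daily prices and news items rarely share an exact timestamp, one possible approach is to join on the calendar day instead. The sketch below uses invented prices and headlines with the column names from the cell above. The `news_date` column is a hypothetical helper added for this illustration, and the real dataset may need extra handling (for example, time zones).

```python
import pandas as pd

# Invented daily prices (column names follow the cell above)
price_df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "close": [0.50, 0.56, 0.57],
})

# Invented news items with intraday timestamps
news_df = pd.DataFrame({
    "newsDatetime": pd.to_datetime([
        "2024-01-02 09:30:00",
        "2024-01-02 17:45:00",
        "2024-01-03 08:00:00",
    ]),
    "title": ["Headline A", "Headline B", "Headline C"],
})

# Keep only days where the price moved by more than 5%
price_df["price_change"] = price_df["close"].pct_change()
significant_changes = price_df[price_df["price_change"].abs() > 0.05]

# Hypothetical helper column: the calendar day of each news item (time set to midnight)
news_df["news_date"] = news_df["newsDatetime"].dt.normalize()

# Join each significant day with all news published on that day
merged_df = pd.merge(
    significant_changes, news_df, left_on="date", right_on="news_date", how="left"
)
print(merged_df[["date", "close", "price_change", "title"]])
```

With `how='left'`, days with a significant move but no news still appear in the result, with an empty title.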


In [None]:
import pandas as pd

# Path to the local CSV files
base_path = '/'

# Load the CSV files into DataFrames
cryptopanic_news = pd.read_csv(base_path + 'cryptopanic_news.csv')
currency = pd.read_csv(base_path + 'currency.csv')
news_currency = pd.read_csv(base_path + 'news__currency.csv')
news_currencies_source_joinedResult = pd.read_csv(base_path + 'news_currencies_source_joinedResult.csv')
source = pd.read_csv(base_path + 'source.csv')

# Show the first rows of each DataFrame as a check
print("Cryptopanic News:")
print(cryptopanic_news.head())
print("\nCurrency:")
print(currency.head())
print("\nNews Currency:")
print(news_currency.head())
print("\nNews Currencies Source Joined Result:")
print(news_currencies_source_joinedResult.head())
print("\nSource:")
print(source.head())
