# Algotrade Sentiment with Messari

**Description**: This example demonstrates how to use the Messari API to retrieve news and market data for the assets mentioned in recent news, and then use each asset's news volume to compute momentum-style scores as a rough proxy for sentiment.
 
 **Prerequisites**
 - Python 3.9 or higher (pandas 2.2.3 and the `list[int]` annotations used below require it)
 - A Messari API key (get one at https://messari.io)
 
**Setup Steps**
 1. Install the required packages:
    ```bash
    pip install aiohttp==3.11.4 pandas==2.2.3 requests==2.32.3 tqdm==4.67.0 python-dotenv==1.0.1 plotly==5.24.1 "nbformat>=4.2.0"
    ```
 
 2. Configure your API key by creating a `.env` file in the project root folder:
    ```bash
    cp .env.template .env # Remember to update the .env file with your API key
    ```

This code is adapted from the `01_fetch_data.py` and `02_metrics_and_charting.py` scripts in the `cookbooks/examples/algotrade_sentiment` directory. Some of the code has been modified to run in a Jupyter Notebook environment.

In [None]:
!pip install aiohttp==3.11.4 pandas==2.2.3 requests==2.32.3 tqdm==4.67.0 python-dotenv==1.0.1 plotly==5.24.1 "nbformat>=4.2.0"

In [31]:
from dotenv import load_dotenv
import warnings
import pandas as pd

load_dotenv()  # This loads the .env file from your project root
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", category=pd.errors.SettingWithCopyWarning)

# `01_fetch_data.py`

In [2]:
import asyncio
import datetime
import os

import aiohttp
import numpy as np
import pandas as pd
import requests
from tqdm.asyncio import tqdm
import nest_asyncio

########################
# 0. SETUP & CONFIGURATION
########################

# Variables
MESSARI_API_KEY = os.getenv("MESSARI_API_KEY")
LOOKBACK_DAYS = 28  # Number of days to look back for news and market data. Increase for more history.
OUTPUT_FILEPATH = "output/data.csv"
SEM_VALUE = 10  # Maximum number of concurrent API requests (semaphore size)

# Constants
MESSARI_BASE_URL = "https://api.messari.io"
NEWS_API_URL = f"{MESSARI_BASE_URL}/news/v1/news/feed"
MARKET_DATA_BASE_API_URL = f"{MESSARI_BASE_URL}/marketdata/v1/assets"
HEADERS = {"accept": "application/json", "x-messari-api-key": MESSARI_API_KEY}

SECONDS_IN_A_DAY = 86400
END_TIMESTAMP_S = int(datetime.datetime.now().timestamp())
START_TIMESTAMP_S = END_TIMESTAMP_S - (SECONDS_IN_A_DAY * LOOKBACK_DAYS)
END_TIMESTAMP_MS = END_TIMESTAMP_S * 1000
START_TIMESTAMP_MS = START_TIMESTAMP_S * 1000

print(
    f"⏰ Time window:\n"
    f"  Start: {datetime.datetime.fromtimestamp(START_TIMESTAMP_S).strftime('%Y-%m-%d %H:%M:%S')}\n"
    f"  End:   {datetime.datetime.fromtimestamp(END_TIMESTAMP_S).strftime('%Y-%m-%d %H:%M:%S')}"
)


if not MESSARI_API_KEY:
    raise ValueError(
        "MESSARI_API_KEY environment variable is not set. Please create a .env file in the root directory. You can find an example in .env.template"
    )

⏰ Time window:
  Start: 2024-10-23 16:26:50
  End:   2024-11-20 16:26:50


## 1. FUNCTIONS USED FOR DATA RETRIEVAL AND PROCESSING

### 1.1 News

Fetch news articles from Messari's Feed API endpoint within the specified time range.
Documentation: https://docs.messari.io/reference/feed
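
The news feed is queried with millisecond timestamps in this notebook. As a minimal sketch of how the time window and query parameters fit together, the snippet below rebuilds them with the standard library only. The parameter names mirror those used in the cells that follow; the helper name `build_news_params` and the fixed `now` value are assumptions made for illustration.

```python
import datetime

SECONDS_IN_A_DAY = 86400


def build_news_params(page: int, lookback_days: int, now: datetime.datetime) -> dict:
    """Build query parameters for one page of the news feed (hypothetical helper)."""
    end_s = int(now.timestamp())
    start_s = end_s - SECONDS_IN_A_DAY * lookback_days
    return {
        "sort": 1,
        "publishedBefore": end_s * 1000,  # seconds -> milliseconds
        "publishedAfter": start_s * 1000,
        "limit": 100,
        "page": page,
    }


# A fixed timestamp keeps the example reproducible.
now = datetime.datetime(2024, 11, 20, 16, 26, 50, tzinfo=datetime.timezone.utc)
params = build_news_params(page=1, lookback_days=28, now=now)
window_days = (params["publishedBefore"] - params["publishedAfter"]) / (SECONDS_IN_A_DAY * 1000)
print(window_days)  # 28.0
```

Only `page` changes between requests, which is why the pages can be fetched concurrently once the total page count is known.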

### 1.2 Market Data

Fetch market data for a list of assets from the Messari API.
Documentation: https://docs.messari.io/reference/timeseries-by-asset-id
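
Both fetchers below cap the number of in-flight requests with an `asyncio.Semaphore`. The following sketch shows that pattern in isolation, using a stand-in coroutine instead of a real HTTP call; `fake_fetch`, `fetch_all`, and the `asset-N` identifiers are hypothetical and exist only for this illustration.

```python
import asyncio


async def fake_fetch(asset_id: str, sem: asyncio.Semaphore, active: list) -> dict:
    """Stand-in for an HTTP request; tracks concurrency instead of using the network."""
    async with sem:  # at most `sem_value` coroutines run this block at once
        active[0] += 1
        active[1] = max(active[1], active[0])
        await asyncio.sleep(0.01)  # simulated latency
        active[0] -= 1
        return {"asset_id": asset_id}


async def fetch_all(asset_ids: list, sem_value: int) -> tuple:
    """Run all fetches concurrently, bounded by a semaphore."""
    sem = asyncio.Semaphore(sem_value)
    active = [0, 0]  # [currently running, peak running]
    results = await asyncio.gather(*(fake_fetch(a, sem, active) for a in asset_ids))
    return results, active[1]


results, peak = asyncio.run(fetch_all([f"asset-{i}" for i in range(10)], sem_value=3))
print(len(results), peak)  # 10 3
```

`asyncio.run` works in a plain script. Inside a notebook, where an event loop is already running, `await fetch_all(...)` or the `nest_asyncio` approach used later in this notebook would be needed instead.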

### 1.3 Utils

In [3]:
########################
# 1. NEWS DATA FUNCTIONS
########################
async def fetch_single_news_page(
    session: aiohttp.ClientSession,
    sem: asyncio.Semaphore,
    page: int,
    start_timestamp: int,
    end_timestamp: int,
) -> pd.DataFrame:
    """Fetches a single page of news from Messari API"""
    params = {
        "sort": 1,
        "publishedBefore": end_timestamp,
        "publishedAfter": start_timestamp,
        "limit": 100,
        "page": page,
    }

    async with sem:
        await asyncio.sleep(1)  # Rate limiting
        async with session.get(
            NEWS_API_URL, headers=HEADERS, params=params
        ) as response:
            response.raise_for_status()  # Fail loudly on HTTP errors (e.g. 401, 429)
            return pd.DataFrame((await response.json())["data"])


async def fetch_all_news_pages(
    pages: list[int], sem_value: int, start_timestamp: int, end_timestamp: int
) -> pd.DataFrame:
    """Fetches all news pages concurrently"""
    sem = asyncio.Semaphore(sem_value)
    async with aiohttp.ClientSession() as session:
        tasks = [
            fetch_single_news_page(session, sem, page, start_timestamp, end_timestamp)
            for page in pages
        ]
        results = await tqdm.gather(*tasks, desc="Fetching news pages")
    return pd.concat(results)


def process_news_data(
    news_df: pd.DataFrame,
) -> tuple[pd.DataFrame, np.ndarray, np.ndarray]:
    """Process raw news data into required format"""
    # Convert timestamp and clean unnecessary columns
    news_df["date"] = pd.to_datetime(news_df["publishTimeMillis"], unit="ms").dt.date
    news_df = news_df.drop(columns=["publishTimeMillis"])

    # Filter and process asset-related news
    news_df = news_df[news_df["assets"].notna()].explode("assets")

    # Extract asset information
    news_df["asset_id"] = news_df["assets"].apply(lambda x: x["id"])
    news_df["asset_name"] = news_df["assets"].apply(lambda x: x["name"])
    news_df = news_df.drop(columns=["assets"])

    # Clean text fields
    for col in ["title", "description", "url"]:
        news_df[col] = news_df[col].fillna("None")

    # Group by asset and date
    grouped_df = news_df.groupby(["asset_id", "asset_name", "date"]).agg(
        {"title": list, "description": list, "url": list}
    )
    grouped_df["count"] = grouped_df["title"].apply(len)

    return (
        grouped_df.reset_index(),
        news_df["asset_id"].unique(),
        news_df["date"].unique(),
    )


########################
# 2. MARKET DATA FUNCTIONS
########################
async def fetch_single_asset_market_data(
    session: aiohttp.ClientSession,
    sem: asyncio.Semaphore,
    asset_id: str,
    start_timestamp: int,
    end_timestamp: int,
) -> pd.DataFrame:
    """Fetches market data for a single asset"""
    url = f"{MARKET_DATA_BASE_API_URL}/{asset_id}/price/time-series"
    params = {"interval": "1d", "startTime": start_timestamp, "endTime": end_timestamp}

    async with sem:
        await asyncio.sleep(1)  # Rate limiting
        async with session.get(url, headers=HEADERS, params=params) as response:
            data = await response.json()
            # Skip assets whose response has no price rows instead of raising KeyError
            if not data or not data.get("data"):
                return pd.DataFrame()

            df = pd.DataFrame(data["data"])
            df["asset_id"] = asset_id
            return df


async def fetch_all_market_data(
    asset_ids: np.ndarray, sem_value: int, start_timestamp: int, end_timestamp: int
) -> pd.DataFrame:
    """Fetches market data for all assets concurrently"""
    sem = asyncio.Semaphore(sem_value)
    async with aiohttp.ClientSession() as session:
        tasks = [
            fetch_single_asset_market_data(
                session, sem, asset_id, start_timestamp, end_timestamp
            )
            for asset_id in asset_ids
        ]
        results = await tqdm.gather(*tasks, desc="Fetching market data")

    df = pd.concat(results)
    df["date"] = pd.to_datetime(df["timestamp"], unit="s").dt.date
    return df[["date", "asset_id", "close"]]


########################
# 3. UTILS
########################
def ensure_output_directory(fp: str):
    os.makedirs(os.path.dirname(fp), exist_ok=True)

# Main Execution

This is similar to the `main` function in the `01_fetch_data.py` script.
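
The last step of this section merges the news table with the price table using a left join on `asset_id` and `date`. A left join keeps every news row, so days without a matching price end up with a missing `close`, which is why some rows of `final_df` show an empty value. The sketch below illustrates that behavior in plain Python rather than pandas; the rows and IDs are made up for the example.

```python
# Illustrative rows only; real data comes from the Messari API.
news_rows = [
    {"asset_id": "a1", "date": "2024-10-23", "count": 1},
    {"asset_id": "a1", "date": "2024-10-28", "count": 2},
]
market_rows = [
    {"asset_id": "a1", "date": "2024-10-28", "close": 1.5},
]

# Index prices by the join key (asset_id, date).
close_by_key = {(m["asset_id"], m["date"]): m["close"] for m in market_rows}

# Left join: every news row survives; rows without a price get None.
merged = [
    {**n, "close": close_by_key.get((n["asset_id"], n["date"]))}
    for n in news_rows
]
for row in merged:
    print(row)
```

Rows with a missing `close` are dropped later, during the cleaning step of the metrics section.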

In [4]:
########################
# 4. MAIN EXECUTION
########################

# 1. Fetch News Data
# Get total pages from initial request
response = requests.get(
    NEWS_API_URL,
    headers=HEADERS,
    params={
        "sort": 1,
        "publishedBefore": END_TIMESTAMP_MS,
        "publishedAfter": START_TIMESTAMP_MS,
        "limit": 100,
        "page": 1,
    },
).json()
total_pages = response["metadata"]["totalPages"]

# Fetch and process all news
nest_asyncio.apply()
loop = asyncio.get_event_loop()
news_df = loop.run_until_complete(
    fetch_all_news_pages(
        pages=range(1, total_pages + 1),
        sem_value=SEM_VALUE,
        start_timestamp=START_TIMESTAMP_MS,
        end_timestamp=END_TIMESTAMP_MS,
    )
)
processed_news_df, unique_asset_ids, unique_dates = process_news_data(news_df)
processed_news_df.head()

Fetching news pages: 100%|██████████| 131/131 [00:40<00:00,  3.25it/s]


Unnamed: 0,asset_id,asset_name,date,title,description,url,count
0,0018b9a5-f16d-440b-a617-a7c52ae2fc42,Voltage,2024-10-29,"[Lessons Learned, Future Defined — A Community...",[None],[https://medium.com/@voltage.finance/lessons-l...,1
1,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-23,[Web3Auth is now integrated with Zilliqa EVM],[None],[https://blog.zilliqa.com/web3auth-is-now-inte...,1
2,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-28,[Jasper proto-testnet showcases Zilliqa 2.0 pe...,[None],[https://blog.zilliqa.com/jasper-proto-testnet...,1
3,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-30,[Zilliqa’s Mission to Unlock Web3 for All],[None],[https://blog.zilliqa.com/zilliqas-mission-to-...,1
4,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-11-01,[Zilliqa Monthly Newsletter - October 2024],[None],[https://blog.zilliqa.com/zilliqa-monthly-news...,1


In [5]:
print("Number of unique asset IDs:", len(unique_asset_ids))
print("Sample of first 5 asset IDs:")
print(unique_asset_ids[:5])
print("\nNumber of unique dates:", len(unique_dates))
print("Sample of first 5 dates:")
print(unique_dates[:5])

Number of unique asset IDs: 2153
Sample of first 5 asset IDs:
['2db6b38a-681a-4514-9d67-691e319597ee'
 '977dd0e0-b9c3-4a21-89a7-6762c964c138'
 'e0a59438-233e-4d36-b5e2-93834165c3db'
 '157f4fe3-6046-4b6d-bceb-a2af8ca021b5'
 '1d51479d-68f6-4886-8644-2a55ea9007bf']

Number of unique dates: 29
Sample of first 5 dates:
[datetime.date(2024, 10, 23) datetime.date(2024, 10, 24)
 datetime.date(2024, 10, 25) datetime.date(2024, 10, 26)
 datetime.date(2024, 10, 27)]


In [6]:
# 2. Fetch Market Data
market_df = loop.run_until_complete(
    fetch_all_market_data(
        asset_ids=unique_asset_ids,
        sem_value=SEM_VALUE,
        start_timestamp=START_TIMESTAMP_S,
        end_timestamp=END_TIMESTAMP_S,
    )
)

# 3. Merge and Export
final_df = pd.merge(processed_news_df, market_df, on=["asset_id", "date"], how="left")
ensure_output_directory(OUTPUT_FILEPATH)
final_df.to_csv(OUTPUT_FILEPATH, index=False)
print(f"Data exported to {OUTPUT_FILEPATH}")

Fetching market data: 100%|██████████| 2153/2153 [06:56<00:00,  5.17it/s]


Data exported to output/data.csv


In [9]:
market_df.head()

Unnamed: 0,date,asset_id,close
0,2024-10-24,2db6b38a-681a-4514-9d67-691e319597ee,26.824577
1,2024-10-25,2db6b38a-681a-4514-9d67-691e319597ee,24.895794
2,2024-10-26,2db6b38a-681a-4514-9d67-691e319597ee,25.377741
3,2024-10-27,2db6b38a-681a-4514-9d67-691e319597ee,25.749116
4,2024-10-28,2db6b38a-681a-4514-9d67-691e319597ee,26.242908


In [10]:
final_df.head()

Unnamed: 0,asset_id,asset_name,date,title,description,url,count,close
0,0018b9a5-f16d-440b-a617-a7c52ae2fc42,Voltage,2024-10-29,"[Lessons Learned, Future Defined — A Community...",[None],[https://medium.com/@voltage.finance/lessons-l...,1,
1,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-23,[Web3Auth is now integrated with Zilliqa EVM],[None],[https://blog.zilliqa.com/web3auth-is-now-inte...,1,
2,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-28,[Jasper proto-testnet showcases Zilliqa 2.0 pe...,[None],[https://blog.zilliqa.com/jasper-proto-testnet...,1,0.014205
3,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-10-30,[Zilliqa’s Mission to Unlock Web3 for All],[None],[https://blog.zilliqa.com/zilliqas-mission-to-...,1,0.014648
4,007fe708-6ace-499b-a999-dcaa45e0eaf5,Zilliqa,2024-11-01,[Zilliqa Monthly Newsletter - October 2024],[None],[https://blog.zilliqa.com/zilliqa-monthly-news...,1,0.013911


# `02_metrics_and_charting.py`

In the next section, we calculate several metrics (`trend_score`, `acceleration_score`, `price_score`, and `momentum_score`) and plot the momentum score against price.
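
To make the metric definitions concrete, here is a simplified plain-Python sketch of the same arithmetic for a single asset: a trailing mean of the news count, percentage changes over a fixed number of rows, and their product as the momentum score. It uses a window of 3 and made-up numbers to stay short, and it does not reproduce pandas' handling of NaN padding or infinities; the helper names are hypothetical.

```python
def rolling_mean(values, window, min_periods=1):
    """Trailing mean over up to `window` values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk) if len(chunk) >= min_periods else None)
    return out


def pct_change(values, periods):
    """Relative change versus the value `periods` rows earlier; None where undefined."""
    out = []
    for i, v in enumerate(values):
        prev = values[i - periods] if i >= periods else None
        if v is None or prev is None or prev == 0:
            out.append(None)  # simplification: pandas would yield NaN or inf here
        else:
            out.append(v / prev - 1)
    return out


WINDOW = 3  # the notebook uses 7
counts = [1, 1, 1, 2, 4, 9]  # illustrative news counts per row
closes = [10.0, 10.0, 10.0, 11.0, 12.0, 15.0]  # illustrative close prices

count_avg = rolling_mean(counts, WINDOW)
trend = pct_change(count_avg, WINDOW)  # change in the rolling average
accel = pct_change(trend, 1)  # row-over-row change in the trend
price = pct_change(closes, WINDOW)  # price change over the same window

momentum = [
    None if None in (t, a, p) else t * a * p
    for t, a, p in zip(trend, accel, price)
]
print([None if m is None else round(m, 3) for m in momentum])
```

Because each percentage change needs earlier rows, the first few rows have no score. In this example only the last two rows get a momentum value.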

In [27]:
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

########################
# 0. SETUP & CONFIGURATION
########################

# Variables
INPUT_FILEPATH = "output/data.csv"
OUTPUT_METRICS_FILEPATH = "output/data_with_metrics.csv"
OUTPUT_PLOT_FILEPATH = "output/momentum_vs_price.html"
DEFAULT_ASSET_ID = "b3d5d66c-26a2-404c-9325-91dc714a722b"  # Solana

# Constants
ROLLING_WINDOW = 7  # Days for rolling average calculations
MIN_PERIODS = 1  # Minimum periods for rolling calculations


def load_and_prepare_data():
    """Load data from CSV and prepare initial dataframe"""
    df = pd.read_csv(INPUT_FILEPATH)
    df["date"] = pd.to_datetime(df["date"])
    return df.sort_values(["asset_id", "date"])


def calculate_metrics(df):
    """Calculate momentum-related metrics for each asset"""
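    # Calculate rolling average of the news count over the last 7 rows
    # (rows exist only for dates with news, so this is 7 days only without gaps)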
    df["count_7d_avg"] = df.groupby("asset_id")["count"].transform(
        lambda x: x.rolling(window=ROLLING_WINDOW, min_periods=MIN_PERIODS).mean()
    )

    # Calculate trend score (percentage change in the rolling average over 7 rows)
    df["trend_score"] = df.groupby("asset_id")["count_7d_avg"].transform(
        lambda x: x.pct_change(periods=ROLLING_WINDOW)
    )

    # Calculate acceleration score (1-day change in trend)
    df["acceleration_score"] = df.groupby("asset_id")["trend_score"].transform(
        lambda x: x.pct_change(periods=1)
    )

    # Calculate price score (percentage change in close price over 7 rows)
    df["price_score"] = df.groupby("asset_id")["close"].transform(
        lambda x: x.pct_change(periods=ROLLING_WINDOW)
    )

    return df


def clean_data(df):
    """Clean data by handling infinities and filtering rows"""
    # Replace infinities with NaN
    for col in ["trend_score", "acceleration_score", "price_score"]:
        df[col] = df[col].replace([float("inf"), float("-inf")], float("nan"))

    # Remove rows with NaN values
    # .copy() avoids SettingWithCopyWarning on the assignment below
    df = df.dropna(
        subset=["close", "trend_score", "acceleration_score", "price_score"]
    ).copy()

    # Calculate and filter momentum score
    df["momentum_score"] = (
        df["trend_score"] * df["acceleration_score"] * df["price_score"]
    )
    df = df[df["momentum_score"] > 0]
    df = df[df["acceleration_score"] > 0]

    return df.sort_values(["date", "momentum_score"], ascending=[False, False])


def plot_momentum_vs_price(df, asset_id):
    """Create a dual-axis plot comparing momentum score and price"""
    # Filter for specific asset
    asset_df = df[df["asset_id"] == asset_id]
    if asset_df.empty:
        raise ValueError(f"No rows for asset_id {asset_id} after cleaning")
    asset_name = asset_df["asset_name"].iloc[0]

    # Create figure with dual y-axes
    fig = make_subplots(specs=[[{"secondary_y": True}]])

    # Add momentum score trace
    fig.add_trace(
        go.Scatter(
            x=asset_df["date"], y=asset_df["momentum_score"], name="Momentum Score"
        ),
        secondary_y=False,
    )

    # Add price trace
    fig.add_trace(
        go.Scatter(x=asset_df["date"], y=asset_df["close"], name="Close Price"),
        secondary_y=True,
    )

    # Update layout
    fig.update_layout(title=f"{asset_name} Momentum Score and Close Price Over Time")
    fig.update_yaxes(title_text="Momentum Score", secondary_y=False)
    fig.update_yaxes(title_text="Close Price", secondary_y=True)

    return fig

In [28]:
df = load_and_prepare_data()
df = calculate_metrics(df)
df = clean_data(df)
df

Unnamed: 0,asset_id,asset_name,date,title,description,url,count,close,count_7d_avg,trend_score,acceleration_score,price_score,momentum_score
3221,6b4833f7-4671-4074-9de6-93d6c40bd739,Sky Protocol,2024-11-20,"['Coinbase to delist WBTC, halt trading on Dec...",['Coinbase will\xa0disable Wrapped Bitcoin (WB...,['https://cryptoslate.com/coinbase-to-delist-w...,1,0.060600,1.714286,0.224490,0.571429,0.299031,0.038360
1225,21c795f5-1bfd-40c3-858e-e9d7e820c6d0,Ethereum,2024-11-20,['Ethereum ETFs record sudden outflows: What c...,"['\n \tEthereum ETF inflows hit a high, but be...",['https://ambcrypto.com/ethereum-etfs-record-s...,3,3092.511819,46.285714,-0.349398,3.000602,-0.031112,0.032617
3781,7dc551ba-cfed-4437-a027-386044415e3e,BNB,2024-11-20,['BNB at $615 support level: Is this the calm ...,['\n \tBNB liquidation pools hit $2.75M at the...,['https://ambcrypto.com/bnb-at-615-support-lev...,1,610.112147,5.285714,-0.339286,2.053571,-0.018294,0.012746
6797,e979beec-e534-4f5f-a69d-851b64b38f5f,Jupiter,2024-11-20,['✨ Countdown to Jupuary 2025: Info & Discussi...,"['None', 'None', 'None', 'None', 'None', 'None...",['https://www.jupresear.ch/t/countdown-to-jupu...,11,1.096899,65.714286,-0.235880,0.205611,-0.059474,0.002884
778,17b91b7d-e10b-43be-b046-c170aecd0783,Hedera Hashgraph,2024-11-19,"['Bitcoin’s Price Choppy at $91K, PEPE Dumps b...","[""HBAR and XTZ are today's top performers from...",['https://cryptopotato.com/bitcoins-price-chop...,9,0.133216,3.857143,0.800000,0.866667,1.888619,1.309442
...,...,...,...,...,...,...,...,...,...,...,...,...,...
3593,78c4b0c5-8cbe-4f05-b639-0eb942e86dd5,Sui,2024-10-31,['First Digital Labs Expands FDUSD Stablecoin ...,[' First Digital Labs integrates FDUSD stab...,['https://en.coin-turk.com/first-digital-labs-...,5,1.969365,5.285714,0.510204,1.040816,0.017296,0.009185
996,1d51479d-68f6-4886-8644-2a55ea9007bf,Uniswap,2024-10-31,['Buy Crypto Using Venmo on Uniswap Web and Wa...,"['None', 'The commemorative ‘Investigations’ N...",['https://blog.uniswap.org/buy-crypto-using-ve...,6,7.606876,5.142857,-0.532468,0.138889,-0.053631,0.003966
4385,97775be0-2608-4720-b7af-f85b24c7eb2d,XRP,2024-10-31,['XRP Sees Potential Price Increase This Novem...,[' XRP has potential for price increases as...,['https://en.coin-turk.com/xrp-sees-potential-...,12,0.509055,10.428571,-0.546584,0.080307,-0.042719,0.001875
1205,21c795f5-1bfd-40c3-858e-e9d7e820c6d0,Ethereum,2024-10-31,"['The Devcon schedule is live!', 'Bitcoin and ...","['None', 'Crypto.com’s monthly spot trading vo...",['https://blog.ethereum.org/en/2024/10/31/devc...,92,2516.113112,74.428571,-0.203972,1.141711,-0.007301,0.001700


In [29]:
# Load and process data
df = load_and_prepare_data()
df = calculate_metrics(df)
df = clean_data(df)

# Save processed data
df.to_csv(OUTPUT_METRICS_FILEPATH, index=False)
print(f"Processed data saved to {OUTPUT_METRICS_FILEPATH}")

df.head()

Processed data saved to output/data_with_metrics.csv


Unnamed: 0,asset_id,asset_name,date,title,description,url,count,close,count_7d_avg,trend_score,acceleration_score,price_score,momentum_score
3221,6b4833f7-4671-4074-9de6-93d6c40bd739,Sky Protocol,2024-11-20,"['Coinbase to delist WBTC, halt trading on Dec...",['Coinbase will\xa0disable Wrapped Bitcoin (WB...,['https://cryptoslate.com/coinbase-to-delist-w...,1,0.0606,1.714286,0.22449,0.571429,0.299031,0.03836
1225,21c795f5-1bfd-40c3-858e-e9d7e820c6d0,Ethereum,2024-11-20,['Ethereum ETFs record sudden outflows: What c...,"['\n \tEthereum ETF inflows hit a high, but be...",['https://ambcrypto.com/ethereum-etfs-record-s...,3,3092.511819,46.285714,-0.349398,3.000602,-0.031112,0.032617
3781,7dc551ba-cfed-4437-a027-386044415e3e,BNB,2024-11-20,['BNB at $615 support level: Is this the calm ...,['\n \tBNB liquidation pools hit $2.75M at the...,['https://ambcrypto.com/bnb-at-615-support-lev...,1,610.112147,5.285714,-0.339286,2.053571,-0.018294,0.012746
6797,e979beec-e534-4f5f-a69d-851b64b38f5f,Jupiter,2024-11-20,['✨ Countdown to Jupuary 2025: Info & Discussi...,"['None', 'None', 'None', 'None', 'None', 'None...",['https://www.jupresear.ch/t/countdown-to-jupu...,11,1.096899,65.714286,-0.23588,0.205611,-0.059474,0.002884
778,17b91b7d-e10b-43be-b046-c170aecd0783,Hedera Hashgraph,2024-11-19,"['Bitcoin’s Price Choppy at $91K, PEPE Dumps b...","[""HBAR and XTZ are today's top performers from...",['https://cryptopotato.com/bitcoins-price-chop...,9,0.133216,3.857143,0.8,0.866667,1.888619,1.309442


In [5]:
# Create and display plot
fig = plot_momentum_vs_price(df, DEFAULT_ASSET_ID)
# Save visualization
fig.write_html(OUTPUT_PLOT_FILEPATH)
fig.show()

# The output has been saved in `algotrade_sentiment/output/momentum_vs_price.png` to illustrate the resulting chart.

This concludes our tutorial on analyzing crypto sentiment and price data! We've covered how to:
- Fetch and process news and market data
- Calculate momentum and trend metrics
- Visualize the relationship between sentiment and price movements

Feel free to experiment with different visualization types, metrics, and analysis approaches. The possibilities are endless!

