# Cryptocurrency Analysis with Python: Retrieving Crypto Data from CoinGecko
Find more tutorials about Computer Vision, Microscopy, Biology and Data Science [here](https://medium.com/@microbioscopicdata)

Welcome back to our tutorial series on Cryptocurrency Analysis with Python! In our previous tutorials, we explored how to use powerful Python libraries such as Matplotlib, mplfinance, and yfinance to load and visualize cryptocurrency data from popular sources like Yahoo Finance. We discussed cryptocurrency exchanges, trading expenses (visible and hidden costs), and what you should know about them before you create an account and start trading. We also introduced several risk-adjusted performance metrics, such as the Calmar Ratio, the Sharpe Ratio, and the Sortino Ratio. Finally, we implemented the Simple Moving Average (SMA) Crossover Strategy, calculated the associated trading costs (commissions, spreads, and slippage), and optimized the strategy by exploring different parameter combinations.

In this tutorial, we will retrieve detailed cryptocurrency data from CoinGecko, a popular data aggregator, using Python.

Disclaimer: This article is not financial advice. This is purely introductory knowledge. All investment-related queries should be directed to your financial advisor.

Disclaimer: I am not affiliated with CoinGecko. I personally utilize their platform for retrieving cryptocurrency data.

## Cryptocurrency data aggregators

Cryptocurrency data aggregators simplify the process of accessing a wide range of brokers and cryptocurrency assets through a single interface. They do this by utilizing Application Programming Interfaces (APIs), which act as bridges for different computer programs to communicate and get prices and diverse market data from various exchanges. This information is then presented in a unified format, making it user-friendly. These aggregators also offer additional features such as charts, portfolio tracking, live news updates, and customizable alerts [1].

In this tutorial, we'll be using CoinGecko, a popular data aggregator, for several reasons:

- It consolidates data from numerous cryptocurrency exchanges and sources, providing a comprehensive view of the cryptocurrency market.
- It covers a wide array of cryptocurrencies, not just the most popular ones.
- It offers historical data, allowing us to analyze price trends, trading volumes, and market capitalization over time.
- It provides various market metrics, including trading volumes, market capitalization, circulating supply, and more.
- Most importantly, it offers a free and powerful API that enables us to retrieve real-time cryptocurrency data efficiently.


One limitation of CoinGecko's free plan is its rate limit of 20-30 calls per minute (meaning that we can make at most 20 to 30 API requests to their server in a single minute). However, we can work within this limit by adding waiting times between requests in our Python code [2].
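The waiting-time idea can be sketched with a small helper that spaces out consecutive requests. This is a minimal illustration, not part of pycoingecko: the `Throttle` class and its `calls_per_minute` parameter are hypothetical names introduced here, and the default of 20 simply takes the lower bound of the limit quoted above.

```python
import time


class Throttle:
    """Space out calls so that at most `calls_per_minute` are made per minute."""

    def __init__(self, calls_per_minute=20):
        # Minimum number of seconds that must pass between two calls
        self.min_interval = 60.0 / calls_per_minute
        self._last_call = None

    def wait(self):
        """Sleep just long enough to respect the minimum interval, then record the call time."""
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


# Usage sketch (not executed here because it needs network access and a CoinGeckoAPI instance `cg`):
# throttle = Throttle(calls_per_minute=20)
# for coin_id in ["bitcoin", "ethereum"]:
#     throttle.wait()
#     print(cg.get_price(ids=coin_id, vs_currencies="usd"))
```

Calling `throttle.wait()` before every request keeps the request rate at or below the chosen limit without hard-coding a sleep duration in each loop.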

Besides CoinGecko, there are several other cryptocurrency data aggregators and sources available. Here are some of them:

- CoinMarketCap
- Messari 
- CoinCap
- Coinlore  

and many more...

## Using CoinGecko API 

In order to interact with CoinGecko's data and services, we'll use the CoinGecko API wrapper [**pycoingecko**](https://github.com/man-c/pycoingecko). To install pycoingecko, we run the `pip install -U pycoingecko` command in our computer's command prompt or terminal. Alternatively, we can use the `!pip install -U pycoingecko` command directly in our Jupyter Notebook [3].


The code below imports the Python libraries we will use in this tutorial, creates an instance of the CoinGeckoAPI class, assigns it to the variable `cg`, and checks the connectivity to the server.

In [3]:
# !pip install -U pycoingecko # install the CoinGecko API wrapper, pycoingecko

# Import Libraries
import pandas as pd
import matplotlib.pyplot as plt
from pycoingecko import CoinGeckoAPI
import time
import math


# Create an instance of the CoinGeckoAPI class and assign it to the variable cg
cg = CoinGeckoAPI() # connect

# Check API connectivity using ping method
cg.ping()

{'gecko_says': '(V3) To the Moon!'}

### Simple calls for one coin

The code below retrieves the current price of Bitcoin (BTC) in euros (EUR). The parameters:

- `ids` specifies the cryptocurrency we want to retrieve the price for.
- `vs_currencies` specifies the currency in which we want to get the price. Here, we select the euro, but we can also pass other currencies. We can retrieve all the supported currencies using the `cg.get_supported_vs_currencies()` function.

In [4]:
# Retrieve data for Bitcoin (BTC) from the CoinGecko API
cg.get_price(ids="bitcoin", vs_currencies ="eur")

{'bitcoin': {'eur': 94792}}
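Before requesting a price, it can help to check that the currency code is one CoinGecko accepts. The sketch below assumes the supported codes are lowercase strings, as returned by `cg.get_supported_vs_currencies()`. The helper name `validate_vs_currency` is hypothetical, and the `supported` list is a short illustrative subset so that the example runs without network access.

```python
def validate_vs_currency(currency, supported):
    """Return the normalized (lowercase) currency code, or raise ValueError if it is not supported."""
    code = currency.strip().lower()
    if code not in supported:
        raise ValueError(f"Unsupported vs_currency: {currency!r}")
    return code


# In the notebook, this list would come from the API:
# supported = cg.get_supported_vs_currencies()
supported = ["btc", "eth", "usd", "eur", "gbp"]  # illustrative subset, not the full list

print(validate_vs_currency("EUR", supported))  # eur
```

The returned code can then be passed as `vs_currencies` to `cg.get_price()`.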

### Simple calls for one coin with more parameters

The code below retrieves the current price of Bitcoin (BTC) in US dollars (USD) and also includes its market capitalization, 24-hour trading volume, 24-hour price change, and the timestamp of the last update.

In [12]:
# Retrieve data for Bitcoin (BTC) from the CoinGecko API
data = cg.get_price(ids="bitcoin", vs_currencies ="usd", include_market_cap=True,
            include_24hr_vol=True, include_24hr_change=True, include_last_updated_at=True)

# Convert the retrieved data (in JSON format) into a pandas DataFrame.
# Casting to object lets a single column hold both numbers and a timestamp.
data = pd.DataFrame(data).astype(object)

# Convert the "last_updated_at" value of Bitcoin to a datetime format.
# Using .loc performs the assignment in one step and avoids chained assignment.
data.loc["last_updated_at", "bitcoin"] = pd.to_datetime(data.loc["last_updated_at", "bitcoin"], unit="s")
data

| | bitcoin |
|---|---|
| usd | 97779.0 |
| usd_market_cap | 1938350307242.729736 |
| usd_24h_vol | 38754038491.49572 |
| usd_24h_change | 0.86193 |
| last_updated_at | 2025-02-10 10:39:05 |


### Simple calls for more coins with more parameters

If we want to download data for more cryptocurrencies, we can pass a list of the cryptocurrency IDs\* we want to retrieve prices for to the code above. Additionally, we'll iterate over each column to convert the `last_updated_at` value of each cryptocurrency to a datetime format.

*\*To find the ID of a cryptocurrency, we can search for it on the CoinGecko page.*

In [13]:
# Retrieve data for several cryptocurrencies from the CoinGecko API
data = cg.get_price(ids=["bitcoin","ethereum","binancecoin","polkadot"], vs_currencies ="usd", include_market_cap=True,
            include_24hr_vol=True, include_24hr_change=True, include_last_updated_at=True)

# Convert the retrieved data (in JSON format) into a pandas DataFrame.
# Casting to object lets a single column hold both numbers and a timestamp.
data = pd.DataFrame(data).astype(object)

# Iterate through the columns and convert "last_updated_at" to datetime
for column in data.columns:
    # Using .loc performs the assignment in one step and avoids chained assignment
    data.loc["last_updated_at", column] = pd.to_datetime(data.loc["last_updated_at", column], unit="s")
data
| | binancecoin | bitcoin | ethereum | polkadot |
|---|---|---|---|---|
| usd | 604.99 | 97791.0 | 2649.75 | 4.83 |
| usd_market_cap | 88333617294.338684 | 1939530444969.126709 | 319749835016.640808 | 7353791026.637767 |
| usd_24h_vol | 1545569558.573529 | 38407937064.268562 | 20918075149.845238 | 248446868.796668 |
| usd_24h_change | -5.178899 | 0.903678 | -0.620053 | -0.596785 |
| last_updated_at | 2025-02-10 10:43:01 | 2025-02-10 10:42:58 | 2025-02-10 10:43:01 | 2025-02-10 10:42:58 |
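The layout above has one column per coin, so each column mixes prices and timestamps. An alternative, sketched below, is to put one coin per row so that every column has a single data type and the timestamps can be converted in one vectorized call. The `response` dictionary is a hand-written sample shaped like the output of `cg.get_price(...)`. Its prices are copied from the output above, and its epoch values correspond to the timestamps shown there under the assumption that they are in UTC.

```python
import pandas as pd

# Sample response shaped like the output of cg.get_price(...) with extra fields
response = {
    "bitcoin": {"usd": 97791.0, "usd_24h_change": 0.903678, "last_updated_at": 1739184178},
    "ethereum": {"usd": 2649.75, "usd_24h_change": -0.620053, "last_updated_at": 1739184181},
}

# orient="index" puts one coin per row, so every column keeps a single dtype
tidy = pd.DataFrame.from_dict(response, orient="index")

# Convert the whole column at once; no loop or chained assignment is needed
tidy["last_updated_at"] = pd.to_datetime(tidy["last_updated_at"], unit="s", utc=True)

print(tidy.dtypes)
```

With this orientation, numeric columns such as `usd` stay numeric, which makes later sorting and filtering simpler.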


## Retrieve data for all available coins in CoinGecko

### Retrieve the list with all available coins in CoinGecko

To retrieve all available coins in CoinGecko, we'll use the function `cg.get_coins_list()`.

In [14]:
# Retrieve a list of available cryptocurrencies from the CoinGecko API
crypto_list = cg.get_coins_list()

# Convert the list into a pandas DataFrame for easier manipulation and analysis
crypto_list = pd.DataFrame(crypto_list)

### Find the duplicated names and symbols
However, we must be careful because not all names or symbols are unique, as the `nunique()` function shows below. Only the IDs are unique, so the number of coins equals the number of distinct IDs.

In [15]:
# Use the nunique() function to count the number of unique values in each column
unique_counts = crypto_list.nunique()
unique_counts

id        17092
symbol    13007
name      16760
dtype: int64

To find the duplicated names and symbols, we'll use the code below, which relies on the `.duplicated()` method.

In [16]:
# Check for duplicate cryptocurrency names
duplicate_names = crypto_list[crypto_list['name'].duplicated(keep=False)]

# Check for duplicate cryptocurrency symbols
duplicate_symbols = crypto_list[crypto_list['symbol'].duplicated(keep=False)]

# Display the duplicate names and symbols
print("Duplicate Names:")
print(duplicate_names[["id",'name', 'symbol']])

print("\nDuplicate Symbols:")
print(duplicate_symbols[["id",'name', 'symbol']])

Duplicate Names:
                        id        name  symbol
9      0x678-landwolf-1933    Landwolf    wolf
483             aintivirus  AIntivirus   ainti
484           aintivirus-2  AIntivirus   ainti
556              akita-inu   Akita Inu   akita
557            akita-inu-2   Akita Inu     akt
...                    ...         ...     ...
16988               zeus-2        Zeus    zeus
16992         zeuspepesdog        Zeus    zeus
17060               zoomer      Zoomer  zoomer
17061             zoomer-2      Zoomer  zoomer
17062           zoomer-sol      Zoomer  zoomer

[608 rows x 3 columns]

Duplicate Symbols:
                        id           name  symbol
0                        _     ༼ つ ◕_◕ ༽つ     gib
6            0vix-protocol  0VIX Protocol     vix
9      0x678-landwolf-1933       Landwolf    wolf
11                  0xcoco         0xCoco    coco
13               0xdefcafe      0xDEFCAFE    cafe
...                    ...            ...     ...
17078                 zus
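Because symbols are not unique, it is safer to look up the candidate IDs for a symbol and pick the right one explicitly. The following sketch uses a few rows copied from the listing above. The helper name `ids_for_symbol` is hypothetical, and in the notebook the first argument would be the full `crypto_list` DataFrame.

```python
import pandas as pd

# A few rows copied from the duplicate listing above
sample_list = pd.DataFrame(
    [
        {"id": "akita-inu", "symbol": "akita", "name": "Akita Inu"},
        {"id": "akita-inu-2", "symbol": "akt", "name": "Akita Inu"},
        {"id": "zoomer", "symbol": "zoomer", "name": "Zoomer"},
        {"id": "zoomer-2", "symbol": "zoomer", "name": "Zoomer"},
        {"id": "zoomer-sol", "symbol": "zoomer", "name": "Zoomer"},
    ]
)


def ids_for_symbol(coins, symbol):
    """Return every CoinGecko ID whose ticker symbol matches (case-insensitive)."""
    mask = coins["symbol"].str.lower() == symbol.lower()
    return coins.loc[mask, "id"].tolist()


print(ids_for_symbol(sample_list, "ZOOMER"))  # ['zoomer', 'zoomer-2', 'zoomer-sol']
print(ids_for_symbol(sample_list, "akt"))     # ['akita-inu-2']
```

When the function returns more than one ID, the symbol alone is ambiguous, and the ID is what should be passed to calls such as `cg.get_price()`.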

To retrieve market data (ordered by market capitalization in descending order) for all available coins in CoinGecko, we'll use the function `cg.get_coins_markets()`.   

**For an unknown reason, I wasn't able to set the `per_page` parameter to its maximum value of 250 (the code did not run). For this reason, I left it at the default value of 100.**


In [17]:
data = cg.get_coins_markets(vs_currency="usd", page=1, order="market_cap_desc")
df = pd.DataFrame(data)
df

| | id | symbol | name | image | current_price | market_cap | market_cap_rank | fully_diluted_valuation | total_volume | high_24h | ... | total_supply | max_supply | ath | ath_change_percentage | ath_date | atl | atl_change_percentage | atl_date | roi | last_updated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | bitcoin | btc | Bitcoin | https://coin-images.coingecko.com/coins/images... | 97739.000000 | 1939530444969 | 1 | 1939530444969 | 38712332722 | 97953.000000 | ... | 1.982232e+07 | 2.100000e+07 | 108786.000000 | -10.05344 | 2025-01-20T09:11:54.494Z | 6.781000e+01 | 144200.67149 | 2013-07-06T00:00:00.000Z | | 2025-02-10T10:43:26.828Z |
| 1 | ethereum | eth | Ethereum | https://coin-images.coingecko.com/coins/images... | 2648.570000 | 319749835017 | 2 | 319749835017 | 18131102191 | 2666.290000 | ... | 1.205382e+08 | | 4878.260000 | -45.62129 | 2021-11-10T14:24:19.604Z | 4.329790e-01 | 612570.86683 | 2015-10-20T00:00:00.000Z | {'times': 35.221458613060086, 'currency': 'btc... | 2025-02-10T10:43:26.915Z |
| 2 | tether | usdt | Tether | https://coin-images.coingecko.com/coins/images... | 1.000000 | 141637541264 | 3 | 141637541264 | 58186878246 | 1.000000 | ... | 1.416185e+11 | | 1.320000 | -24.40947 | 2018-07-24T00:00:00.000Z | 5.725210e-01 | 74.68962 | 2015-03-02T00:00:00.000Z | | 2025-02-10T10:43:27.192Z |
| 3 | ripple | xrp | XRP | https://coin-images.coingecko.com/coins/images... | 2.430000 | 140513047289 | 4 | 243226961347 | 4880597334 | 2.470000 | ... | 9.998650e+10 | 1.000000e+11 | 3.400000 | -28.41961 | 2018-01-07T00:00:00.000Z | 2.686210e-03 | 90459.81537 | 2014-05-22T00:00:00.000Z | | 2025-02-10T10:43:25.453Z |
| 4 | solana | sol | Solana | https://coin-images.coingecko.com/coins/images... | 204.090000 | 99578984255 | 5 | 121188173085 | 4708600227 | 205.610000 | ... | 5.937994e+08 | | 293.310000 | -30.35403 | 2025-01-19T11:15:27.957Z | 5.008010e-01 | 40690.63127 | 2020-05-11T19:35:23.449Z | | 2025-02-10T10:43:23.751Z |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 95 | eos | eos | EOS | https://coin-images.coingecko.com/coins/images... | 0.641925 | 975792937 | 96 | 1350902740 | 109586925 | 0.644947 | ... | 2.100000e+09 | 2.100000e+09 | 22.710000 | -97.16659 | 2018-04-29T07:50:33.540Z | 4.027460e-01 | 59.77912 | 2024-11-04T21:55:59.771Z | {'times': -0.3515905750975274, 'currency': 'us... | 2025-02-10T10:43:25.259Z |
| 96 | gala | gala | GALA | https://coin-images.coingecko.com/coins/images... | 0.022624 | 973583360 | 97 | 973584539 | 111604539 | 0.022920 | ... | 4.294583e+10 | 5.000000e+10 | 0.824837 | -97.25450 | 2021-11-26T01:03:48.731Z | 1.347500e-04 | 16705.92409 | 2020-12-28T08:46:48.367Z | | 2025-02-10T10:43:26.773Z |
| 97 | the-sandbox | sand | The Sandbox | https://coin-images.coingecko.com/coins/images... | 0.389442 | 955018729 | 98 | 1171391476 | 98442230 | 0.393765 | ... | 3.000000e+09 | 3.000000e+09 | 8.400000 | -95.35435 | 2021-11-25T06:04:40.957Z | 2.897764e-02 | 1246.15977 | 2020-11-04T15:59:14.441Z | | 2025-02-10T10:43:25.223Z |
| 98 | floki | floki | FLOKI | https://coin-images.coingecko.com/coins/images... | 0.000094 | 913455545 | 99 | 944405415 | 182406573 | 0.000098 | ... | 1.000000e+13 | 1.000000e+13 | 0.000345 | -72.62463 | 2024-06-05T07:25:59.137Z | 8.428000e-08 | 111944.84716 | 2021-07-06T01:11:20.438Z | | 2025-02-10T10:43:25.735Z |


The limitation of the code above is that **we are retrieving data only for the top 100 cryptocurrencies**. To retrieve market data for all available cryptocurrencies using the CoinGecko API, we need to paginate through the data, as there are too many cryptocurrencies to retrieve in a single request (we found more than 17,000 coin IDs earlier using the `cg.get_coins_list()` function).
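As a rough estimate of what the pagination involves, we can compute how many pages are needed and how long the download takes if we stay under the rate limit. This is only a back-of-the-envelope sketch: it assumes every ID returned by `cg.get_coins_list()` also appears in the market data (so the page count is an upper bound), and it uses the lower end of the 20-30 calls per minute quoted earlier.

```python
import math

n_coins = 17092          # unique IDs reported by nunique() earlier in this tutorial
per_page = 100           # default page size of get_coins_markets
calls_per_minute = 20    # lower bound of the free-plan rate limit quoted earlier

n_pages = math.ceil(n_coins / per_page)
pause_s = 60 / calls_per_minute      # seconds between requests to stay under the limit
total_seconds = n_pages * pause_s

print(n_pages)        # 171
print(pause_s)        # 3.0
print(total_seconds)  # 513.0
```

Under these assumptions the full download takes a little under nine minutes, and a pause of about three seconds between requests is what keeps it within 20 calls per minute.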


In [19]:
# Collect each page of results in a list and combine them at the end
pages = []

# Initialize the page number (each page holds 100 coins by default)
page = 1

while True:
    # Retrieve market data for the current page of cryptocurrencies
    data = cg.get_coins_markets(vs_currency="usd", order="market_cap_desc", page=page)

    # Check if the data is empty, indicating no more results
    if not data:
        print("Download completed")
        break

    # Convert the current page of data into a DataFrame and store it
    pages.append(pd.DataFrame(data))
    print(f"Page: {page}", end="\r")

    # Increment the page number for the next request
    page += 1

    # Keep in mind that CoinGecko enforces rate limits; increase the pause if requests start failing
    time.sleep(1)

# Combine all pages; all_crypto_data now contains market data for every page retrieved
all_crypto_data = pd.concat(pages, ignore_index=True)
all_crypto_data

Page: 25

KeyboardInterrupt: 
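A long download like this one can be interrupted or hit an API error partway through. One way to keep the pages already retrieved is to wrap the loop in a function that returns whatever it has collected. The sketch below is a generic helper, not part of pycoingecko: `fetch_all_pages`, `fetch_page`, `max_pages`, and `pause_s` are hypothetical names, and the demonstration uses a fake page source so that it runs without network access.

```python
import time


def fetch_all_pages(fetch_page, max_pages=500, pause_s=3.0):
    """Call fetch_page(page) for page = 1, 2, ... until an empty page or max_pages is reached.

    Returns the rows collected so far, even if an error or Ctrl+C stops the loop early.
    """
    rows = []
    try:
        for page in range(1, max_pages + 1):
            data = fetch_page(page)
            if not data:          # an empty page means there are no more results
                break
            rows.extend(data)
            time.sleep(pause_s)   # pause between requests to respect the rate limit
    except (Exception, KeyboardInterrupt) as exc:
        print(f"Stopped early after {len(rows)} rows: {exc!r}")
    return rows


# Demonstration with a fake two-page source instead of the real API
fake_pages = {1: [{"id": "bitcoin"}], 2: [{"id": "ethereum"}]}
rows = fetch_all_pages(lambda p: fake_pages.get(p, []), pause_s=0)
print(len(rows))  # 2

# With the real API (not executed here), the call could look like:
# rows = fetch_all_pages(lambda p: cg.get_coins_markets(vs_currency="usd", order="market_cap_desc", page=p))
# all_crypto_data = pd.DataFrame(rows)
```

The `max_pages` cap guards against an endless loop if the API never returns an empty page.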

In [18]:
# Save the DataFrame to a CSV file
all_crypto_data.to_csv('all_crypto_data_Cee.csv', index=False)

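One detail to keep in mind when saving to CSV is that the `roi` column of the market data contains dictionaries, and these are written out as plain text. The sketch below illustrates this with two hand-written rows (the `roi` values are shortened for illustration) and a temporary file whose name is hypothetical.

```python
import os
import tempfile

import pandas as pd

# Two illustrative rows; the roi dict mimics the nested values in the market data
df = pd.DataFrame(
    [
        {"id": "ethereum", "current_price": 2648.57, "roi": {"times": 35.2, "currency": "btc"}},
        {"id": "bitcoin", "current_price": 97739.0, "roi": None},
    ]
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "crypto_sample.csv")  # hypothetical file name
    df.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

print(type(df.loc[0, "roi"]).__name__)        # dict
print(type(reloaded.loc[0, "roi"]).__name__)  # str
```

After reloading, the nested values are strings rather than dictionaries, so they need to be parsed again (or the column flattened before saving) if we want to use them.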

## Conclusions

In this tutorial, we explored how to retrieve market data for all available cryptocurrencies from CoinGecko, a popular data aggregator, using Python. We saw how to use the CoinGecko API wrapper [**pycoingecko**](https://github.com/man-c/pycoingecko) and how to page through the results, with pauses between requests, to work within the rate limits of CoinGecko's free plan. With the obtained dataset, we have the flexibility to perform various analyses, track market trends, and make informed decisions. Before that, however, we need to understand and clean the data, which we will explore in our next tutorial.





## References:
[1] “Top 15 Cryptocurrency Data Aggregators that Everyone Should Use,” Cryptopolitan. https://www.cryptopolitan.com/top-15-cryptocurrency-data-aggregators/ (accessed Aug. 23, 2023).

[2] “Crypto API Pricing Plans,” CoinGecko. https://www.coingecko.com/en/api/pricing (accessed Aug. 23, 2023).

[3] M. Christoforou, “CoinGecko API wrapper,” Aug. 06, 2023. https://github.com/man-c/pycoingecko (accessed Aug. 23, 2023).
