# Webscraping Crypto-Exchanges with Python

This is a brief example of writing a webscraping script in Python to obtain a list of the exchanges that list a specific crypto-asset. Another example of pulling data from a website can be found here: https://www.dataquest.io/blog/web-scraping-tutorial-python/.

To begin, this script uses three Python libraries: 1) requests; 2) BeautifulSoup (imported from bs4); and 3) sys. The sys module ships with Python, while requests and BeautifulSoup must be installed first. We need to import all three before the rest of the script can run. Run the following cell to import them.

In [3]:
# Import Libraries

import sys
import requests
from bs4 import BeautifulSoup

Now that the libraries are imported, we need code that requests the URL we want information from and then parses the returned HTML so that we can scrape data from it.

In [6]:
# sets page as a variable that requests the coinmarketcap data on Bitcoin markets
page = requests.get("https://coinmarketcap.com/currencies/bitcoin/#markets")

# uses BeautifulSoup to parse the page variable described above.
soup = BeautifulSoup(page.content, 'html.parser')
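
A request can fail or come back with an error page, in which case `soup` would not contain the markets table at all. As a minimal sketch, the response can be checked with `raise_for_status()` before it is parsed. The helper names `soup_from_response` and `fetch_soup` and the 10-second timeout are illustrative assumptions, not part of the original script.

```python
import requests
from bs4 import BeautifulSoup


def soup_from_response(response):
    """Parse a requests.Response into a BeautifulSoup tree.

    Hypothetical helper for illustration. raise_for_status() raises
    requests.HTTPError when the server answered with a 4xx or 5xx status.
    """
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")


def fetch_soup(url, timeout=10):
    """Download url and return the parsed page.

    The timeout value is an assumption; without one, requests.get
    can wait indefinitely on an unresponsive server.
    """
    return soup_from_response(requests.get(url, timeout=timeout))
```

With these helpers, `soup = fetch_soup("https://coinmarketcap.com/currencies/bitcoin/#markets")` would stand in for the two lines in the cell above.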

We are now ready to look at the different HTML elements within the selected page. For more on HTML elements, see the tutorial linked above or https://www.w3schools.com/Html/html_elements.asp.

On the coinmarketcap webpage, open the developer tools or the inspect-element view, depending on the browser you are using. I am using Chrome, so I can right-click and choose Inspect at the bottom of the menu, or press Ctrl + Shift + I.

This opens a panel showing all of the elements that make up the page we are trying to pull data from. Hover over the different elements and notice how the browser highlights the part of the page that each one corresponds to.

If you right-click just above the Markets table and choose Inspect, you should see the table highlighted similar to the image below:

![image.png](attachment:image.png)

Here we can identify the specific class or id that the page uses for the data we want. In this case the table we want to extract has an id of "markets-table". Note that ids are the best attribute to target when available, because an id is meant to be unique within a page. From this information we can write the following code to pull the contents of this table.

In [7]:
# sets markets_table as a variable and will use the soup variable above to find id="markets-table"
markets_table = soup.find(id="markets-table")
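
Note that `soup.find` returns `None` when nothing matches, which can happen if the site changes its markup. The sketch below guards against that case on a small, made-up HTML fragment. The `find_table` helper and `SAMPLE_HTML` are hypothetical names used only for illustration.

```python
from bs4 import BeautifulSoup

# Made-up fragment standing in for the real page (an assumption for this sketch)
SAMPLE_HTML = """
<table id="markets-table">
  <tbody><tr><td>1</td></tr></tbody>
</table>
"""


def find_table(html, table_id="markets-table"):
    """Return the element with the given id, or raise if it is missing."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(id=table_id)
    if table is None:
        # Fail with a clear message here instead of a confusing error later on
        raise ValueError(f"no element with id={table_id!r} found")
    return table


table = find_table(SAMPLE_HTML)
print(table.name)
```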

In [8]:
# This code will print all of the markets_table information.
print(markets_table)

<table class="table no-border table-condensed floating-header" id="markets-table">
<thead>
<tr>
<th class="text-right sortable">#</th>
<th class="sortable" id="th-source">Source</th>
<th class="sortable" id="th-pair">Pair</th>
<th class="text-right sortable">Volume (24h)</th>
<th class="text-right sortable">Price</th>
<th class="text-right sortable">Volume (%)</th>
<th class="text-right sortable">Updated</th>
</tr>
</thead>
<tbody>
<tr>
<td class="text-right">1</td>
<td data-sort="Bit-Z"><img alt="Bit-Z" class="logo" src="https://s2.coinmarketcap.com/static/img/exchanges/16x16/300.png"/><a class="link-secondary" href="/exchanges/bit-z/">Bit-Z</a></td>
<td data-sort="ETH/BTC"><a href="https://www.bit-z.com/trade/eth_btc" target="_blank">ETH/BTC</a></td>
<td class="text-right" data-sort="1056750000.0">
<span class="volume" data-btc="171090.0" data-native="2372960.0" data-usd="1056750000.0">
$1,056,750,000
</span>
</td>
<td class="text-right" data-sort="6170.39">
<span class="price" data-

We do not need all of the information in the table, as we are only looking for the names of the exchanges that Bitcoin is listed on, according to coinmarketcap. Look through the elements nested under the markets-table element. You should see that each name sits inside an `<a>` tag with class="link-secondary", similar to the image below:

![image.png](attachment:image.png)

If you inspect the other exchange names, you will notice that they all have the same class. As such, we can use the following code to select every element with the link-secondary class within this table.

In [9]:
# uses mark as a variable to select all elements that have the link-secondary class.
# make sure to include the . before the class name you are trying to select
mark = markets_table.select(".link-secondary")

# extracts the text of each selected element (the exchange name) into a list
exchanges = [e.get_text() for e in mark]
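
Searching inside `markets_table` rather than the whole `soup` matters because other links on the page may share the same class. As a sketch on a made-up fragment (the sample HTML, including the "About" link and the OKEx href, is an assumption), the two-step lookup above can also be written as a single CSS selector that combines the id and the class:

```python
from bs4 import BeautifulSoup

# Made-up fragment: one link-secondary link outside the table, two inside it
SAMPLE_HTML = """
<a class="link-secondary" href="/about/">About</a>
<table id="markets-table">
  <tbody>
    <tr><td><a class="link-secondary" href="/exchanges/bit-z/">Bit-Z</a></td></tr>
    <tr><td><a class="link-secondary" href="/exchanges/okex/">OKEx</a></td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Two steps: find the table by id, then select by class within it
two_step = [a.get_text() for a in soup.find(id="markets-table").select(".link-secondary")]

# One step: "#" selects by id, "." by class, and the space means "descendant of"
one_step = [a.get_text() for a in soup.select("#markets-table a.link-secondary")]

# Selecting on the whole page also picks up the link outside the table
whole_page = [a.get_text() for a in soup.select(".link-secondary")]

print(two_step)
print(one_step)
print(whole_page)
```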

Now we should be ready to see the results.

In [10]:
print(exchanges)

['Bit-Z', 'OKEx', 'ZB.COM', 'Binance', 'Bitfinex', 'Simex', 'ZB.COM', 'HitBTC', 'OKEx', 'Huobi', 'ZB.COM', 'Coinsuper', 'Bitstamp', 'OKEx', 'OKEx', 'BCEX', 'OKEx', 'OOOBTC', 'Kraken', 'GDAX', 'EXX', 'LBank', 'OKEx', 'EXX', 'bitFlyer', 'itBit', 'Binance', 'Bibox', 'Binance', 'DigiFinex', 'OKEx', 'HitBTC', 'Bibox', 'RightBTC', 'Bibox', 'HitBTC', 'Upbit', 'BCEX', 'BTCBOX', 'TOPBTC', 'Bit-Z', 'Bibox', 'HitBTC', 'BitForex', 'LBank', 'Binance', 'IDAX', 'Kraken', 'CoinsBank', 'Huobi', 'Huobi', 'HADAX', 'Allcoin', 'Gemini', 'Huobi', 'LBank', 'CoinsBank', 'Binance', 'OEX', 'Bithumb', 'B2BX', 'OOOBTC', 'Bit-Z', 'Binance', 'LBank', 'Huobi', 'OKEx', 'Allcoin', 'Bit-Z', 'Binance', 'OKEx', 'HitBTC', 'Bitfinex', 'Trade By Trade', 'BCEX', 'Coindeal', 'RightBTC', 'Livecoin', 'Binance', 'Binance', 'BCEX', 'Binance', 'Simex', 'Binance', 'YoBit', 'DigiFinex', 'LBank', 'Exmo', 'Binance', 'Bitbank', 'OKEx', 'Binance', 'DragonEX', 'Binance', 'TOPBTC', 'Exmo', 'BCEX', 'CoinsBank', 'CoinTiger', 'Bittrex', 'Bit
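
The printed list repeats many names, presumably because each table row is one trading pair and an exchange can list several pairs. If the goal is one entry per exchange, the list can be de-duplicated. The sketch below uses only the first few names from the output above as sample data; `unique_exchanges` and `pair_counts` are illustrative names.

```python
from collections import Counter

# First nine names copied from the printed output above
sample = ['Bit-Z', 'OKEx', 'ZB.COM', 'Binance', 'Bitfinex',
          'Simex', 'ZB.COM', 'HitBTC', 'OKEx']

# dict.fromkeys keeps the first occurrence of each name and preserves order
unique_exchanges = list(dict.fromkeys(sample))

# Counter records how many rows each exchange contributed
pair_counts = Counter(sample)

print(unique_exchanges)
print(pair_counts["OKEx"])
```

Using `set(sample)` would also remove duplicates, but it does not keep the original table order.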