# Step 1 / Exploring unstructured data

Collect two types of data through the Binance API.  

* The API exposes price information for the various markets (BTC-USDT, BTC-ETH, …).
* The goal is to write a generic data-retrieval function so that data can be fetched for any market.
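
As a rough illustration of what such a generic retrieval function could look like, here is a minimal sketch using only the standard library. The names `build_url` and `fetch` are hypothetical, and the base URL is the `api.binance.com` host used by the code later in this section; the `opener` argument is an assumption added so the function can be tried without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the exchangeInfo call used later in this section
BASE_URL = "https://api.binance.com/api/v3"


def build_url(endpoint, **params):
    """Build the URL of a public endpoint, dropping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{endpoint.lstrip('/')}"
    return f"{url}?{query}" if query else url


def fetch(endpoint, opener=urlopen, **params):
    """Fetch and decode the JSON payload of an endpoint for any market.

    `opener` is injectable (hypothetical design choice) so that the
    function can be exercised with a fake response.
    """
    with opener(build_url(endpoint, **params)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With this sketch, `fetch("trades", symbol="BTCUSDT")` would request the recent trades of one market and `fetch("exchangeInfo")` the description of all pairs; only the symbol changes from one market to the next.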

## General explanations

>**API Terminology**

These terms will be used throughout the documentation, so it is recommended that you read them to enhance your understanding of the API (especially for new users).
- **Base asset** refers to the asset that is the quantity of a symbol; for the symbol BTCUSDT, BTC would be the base asset.
- **Quote asset** refers to the asset that is the price of a symbol; for the symbol BTCUSDT, USDT would be the quote asset.  
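
To make the base/quote distinction concrete, the sketch below computes the amount of quote asset exchanged in a trade as price times base quantity. The function name `quote_quantity` is hypothetical; the price and quantity are those of the first trade listed in the recent-trades output later in this section, and `Decimal` is used to avoid floating-point rounding.

```python
from decimal import Decimal


def quote_quantity(price, base_qty):
    """Amount of quote asset exchanged: price (quote per base unit) times base quantity."""
    return Decimal(price) * Decimal(base_qty)


# Price and quantity of the first trade shown later in this section
print(quote_quantity("22926.44", "0.008152"))  # 186.89633888
```

Rounded to four decimals, this gives the `quoteQty` of 186.8963 reported for that trade.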

>**Symbol Status**
- PRE_TRADING
- TRADING
- POST_TRADING
- END_OF_DAY
- HALT
- AUCTION_MATCH
- BREAK  
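
One possible use of the symbol status is to keep only the pairs that can currently be traded. The following is a minimal sketch; the helper name `tradable_symbols` is hypothetical, and the statuses in the sample are made up for illustration, not taken from the API.

```python
def tradable_symbols(pairs):
    """Return the symbols whose status is TRADING."""
    return [p["symbol"] for p in pairs if p.get("status") == "TRADING"]


# Illustrative sample: these statuses are invented for the example
sample = [
    {"symbol": "ETHBTC", "status": "TRADING"},
    {"symbol": "LTCBTC", "status": "BREAK"},
    {"symbol": "BNBBTC", "status": "HALT"},
]

print(tradable_symbols(sample))  # ['ETHBTC']
```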

| Status | Description |
|---|---|
| `NEW` | The order has been accepted by the engine |
| `PARTIALLY_FILLED` | Part of the order has been filled |
| `FILLED` | The order has been completed |
| `CANCELED` | The order has been canceled by the user |
| `PENDING_CANCEL` | This is currently unused |
| `REJECTED` | The order was not accepted by the engine and not processed |
| `EXPIRED` | The order was canceled according to the order type's rules (e.g., LIMIT FOK orders with no fill, LIMIT IOC, or MARKET orders that partially fill), or by the exchange (e.g., orders canceled during liquidation or orders canceled during maintenance) |
| `EXPIRED_IN_MATCH` | The order was canceled by the exchange due to STP (e.g., an order with EXPIRE_TAKER will match with existing orders on the book with the same account or same tradeGroupId) |
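
The order statuses above can be modelled as an enumeration, which makes it easy to reject unknown values. The sketch below also groups the statuses into "open" and "terminal"; that grouping is an assumption based on a reading of the descriptions, not something the API states, and the names `OrderStatus`, `TERMINAL` and `is_open` are hypothetical.

```python
from enum import Enum


class OrderStatus(str, Enum):
    NEW = "NEW"
    PARTIALLY_FILLED = "PARTIALLY_FILLED"
    FILLED = "FILLED"
    CANCELED = "CANCELED"
    PENDING_CANCEL = "PENDING_CANCEL"
    REJECTED = "REJECTED"
    EXPIRED = "EXPIRED"
    EXPIRED_IN_MATCH = "EXPIRED_IN_MATCH"


# Assumption: statuses after which no further fill is expected
TERMINAL = frozenset({
    OrderStatus.FILLED,
    OrderStatus.CANCELED,
    OrderStatus.REJECTED,
    OrderStatus.EXPIRED,
    OrderStatus.EXPIRED_IN_MATCH,
})


def is_open(status):
    """True if an order with this status may still be filled (under the assumption above)."""
    return OrderStatus(status) not in TERMINAL
```

For instance, `is_open("PARTIALLY_FILLED")` is true and `is_open("FILLED")` is false, while an unknown string raises `ValueError`.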

## Basic pair data

In [2]:
import requests
import pandas as pd

# More readable display format for floats
#pd.options.display.float_format = '{:.8f}'.format
# Format with 10 decimals, then strip trailing zeros and a dangling decimal point
pd.options.display.float_format = lambda x: f'{x:.10f}'.rstrip('0').rstrip('.')

def get_pairs_list():
    url = 'https://api.binance.com/api/v3/exchangeInfo'
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on an HTTP error instead of parsing an error payload
    # List of all pairs
    return response.json()['symbols']

def get_pair_info(pair):
    # Turn the list of filters into a dict keyed by filter type for easier lookup
    filtres = {f['filterType']: f for f in pair['filters']}

    return {
        'symbol': pair['symbol'],
        'baseAsset': pair['baseAsset'],
        'quoteAsset': pair['quoteAsset'],
        'status': pair['status'],
        'minPrice': filtres.get('PRICE_FILTER', {}).get('minPrice'),
        'maxPrice': filtres.get('PRICE_FILTER', {}).get('maxPrice'),
        'tickSize': filtres.get('PRICE_FILTER', {}).get('tickSize'),
        'minQty': filtres.get('LOT_SIZE', {}).get('minQty'),
        'maxQty': filtres.get('LOT_SIZE', {}).get('maxQty'),
        'stepSize': filtres.get('LOT_SIZE', {}).get('stepSize'),
        'minNotional': filtres.get('NOTIONAL', {}).get('minNotional'),
        'maxNotional': filtres.get('NOTIONAL', {}).get('maxNotional'),
    }

pairs_list = get_pairs_list()
pairs_general_infos = [get_pair_info(pair) for pair in pairs_list]

# Convert to a DataFrame
df_pairs_general_infos = pd.DataFrame(pairs_general_infos)

# Convert the numeric columns to float (the API returns them as strings)
numeric_cols = ['minPrice', 'maxPrice', 'tickSize', 'minQty', 'maxQty', 'stepSize', 'minNotional', 'maxNotional']
df_pairs_general_infos[numeric_cols] = df_pairs_general_infos[numeric_cols].astype(float)

# Display
display(df_pairs_general_infos.head())

| | symbol | baseAsset | quoteAsset | status | minPrice | maxPrice | tickSize | minQty | maxQty | stepSize | minNotional | maxNotional |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ETHBTC | ETH | BTC | TRADING | 1e-05 | 922327 | 1e-05 | 0.0001 | 100000 | 0.0001 | 0.0001 | 9000000 |
| 1 | LTCBTC | LTC | BTC | TRADING | 1e-06 | 100000 | 1e-06 | 0.001 | 100000 | 0.001 | 0.0001 | 9000000 |
| 2 | BNBBTC | BNB | BTC | TRADING | 1e-06 | 100000 | 1e-06 | 0.001 | 100000 | 0.001 | 0.0001 | 9000000 |
| 3 | NEOBTC | NEO | BTC | TRADING | 1e-07 | 100000 | 1e-07 | 0.01 | 100000 | 0.01 | 0.0001 | 9000000 |
| 4 | QTUMETH | QTUM | ETH | TRADING | 1e-06 | 1000 | 1e-06 | 0.1 | 90000000 | 0.1 | 0.001 | 9000000 |
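
The filter values extracted here constrain what an order on a pair may look like. As a hedged illustration, the sketch below checks a price and a quantity against the ETHBTC row, under the usual reading of these filters (price within bounds and a multiple of `tickSize`, quantity within bounds and aligned on `stepSize` starting from `minQty`, price times quantity at least `minNotional`). The function name `check_order` and the example price and quantities are hypothetical.

```python
from decimal import Decimal


def check_order(price, qty, f):
    """Return the list of filter violations for a price/quantity pair (empty if none)."""
    price, qty = Decimal(price), Decimal(qty)
    problems = []
    if not Decimal(f["minPrice"]) <= price <= Decimal(f["maxPrice"]):
        problems.append("price out of range")
    if price % Decimal(f["tickSize"]) != 0:
        problems.append("price not a multiple of tickSize")
    if not Decimal(f["minQty"]) <= qty <= Decimal(f["maxQty"]):
        problems.append("quantity out of range")
    if (qty - Decimal(f["minQty"])) % Decimal(f["stepSize"]) != 0:
        problems.append("quantity not aligned on stepSize")
    if price * qty < Decimal(f["minNotional"]):
        problems.append("notional below minNotional")
    return problems


# Values of the ETHBTC row, written as decimal strings
ETHBTC = {
    "minPrice": "0.00001", "maxPrice": "922327", "tickSize": "0.00001",
    "minQty": "0.0001", "maxQty": "100000", "stepSize": "0.0001",
    "minNotional": "0.0001",
}

print(check_order("0.07", "0.01", ETHBTC))          # []
print(check_order("0.070001", "0.00015", ETHBTC))   # three violations
```

`Decimal` is used rather than `float` because the modulo checks on values such as `1e-05` are not reliable in binary floating point.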


## Retrieving recent trades

In [11]:
import requests
import pandas as pd

# Return the list of all symbols
def symbol_list():
  resp = requests.get('https://api.binance.us/api/v3/ticker/price').json()
  # Each entry is a dict with a 'symbol' and a 'price' key
  return [entry['symbol'] for entry in resp]

# Build the list of symbols
symbols_list = symbol_list()

# Return the recent trades of a pair as a DataFrame
def recup_trades(pair):
  trades = requests.get('https://api.binance.us/api/v3/trades', params={'symbol': pair}).json()
  trades = pd.DataFrame(trades)
  trades['symbol'] = pair
  return trades

# Collect the recent trades of every pair, then concatenate once
# (calling pd.concat inside the loop would copy the growing DataFrame at each iteration)
trades_data = pd.concat([recup_trades(symbol) for symbol in symbols_list], ignore_index=True)

# Display the DataFrame
trades_data.head()

| | id | price | qty | quoteQty | time | isBuyerMaker | isBestMatch | symbol |
|---|---|---|---|---|---|---|---|---|
| 0 | 65159136 | 22926.44 | 0.008152 | 186.8963 | 1675658469799 | False | True | BTCUSD4 |
| 1 | 65159137 | 22922.19 | 0.010815 | 247.9034 | 1675658472571 | True | True | BTCUSD4 |
| 2 | 65159138 | 22925.79 | 0.012616 | 289.2317 | 1675658473351 | False | True | BTCUSD4 |
| 3 | 65159139 | 22923.4 | 0.004365 | 100.0606 | 1675658474305 | True | True | BTCUSD4 |
| 4 | 65159140 | 22925.36 | 0.000788 | 18.0651 | 1675658492101 | True | True | BTCUSD4 |
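
The `isBuyerMaker` column indicates which side was resting on the order book: when it is true the buyer was the maker, so the trade was initiated by a seller. As a small sketch of how this could be used, the code below sums the quote volume per initiating side over the five rows shown here. The function name `taker_volumes` is hypothetical.

```python
from decimal import Decimal


def taker_volumes(trades):
    """Sum quoteQty per initiating (taker) side."""
    totals = {"buy": Decimal("0"), "sell": Decimal("0")}
    for t in trades:
        # Buyer is maker -> the taker was the seller
        side = "sell" if t["isBuyerMaker"] else "buy"
        totals[side] += Decimal(t["quoteQty"])
    return totals


# The five rows displayed in this section
sample = [
    {"quoteQty": "186.8963", "isBuyerMaker": False},
    {"quoteQty": "247.9034", "isBuyerMaker": True},
    {"quoteQty": "289.2317", "isBuyerMaker": False},
    {"quoteQty": "100.0606", "isBuyerMaker": True},
    {"quoteQty": "18.0651", "isBuyerMaker": True},
]

print(taker_volumes(sample))
```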


In [13]:
# Convert the time column of trades_data from Unix milliseconds to datetimes
trades_data['time'] = pd.to_datetime(trades_data['time'], unit='ms')
trades_data.head()

| | id | price | qty | quoteQty | time | isBuyerMaker | isBestMatch | symbol |
|---|---|---|---|---|---|---|---|---|
| 0 | 65159136 | 22926.44 | 0.008152 | 186.8963 | 2023-02-06 04:41:09.799 | False | True | BTCUSD4 |
| 1 | 65159137 | 22922.19 | 0.010815 | 247.9034 | 2023-02-06 04:41:12.571 | True | True | BTCUSD4 |
| 2 | 65159138 | 22925.79 | 0.012616 | 289.2317 | 2023-02-06 04:41:13.351 | False | True | BTCUSD4 |
| 3 | 65159139 | 22923.4 | 0.004365 | 100.0606 | 2023-02-06 04:41:14.305 | True | True | BTCUSD4 |
| 4 | 65159140 | 22925.36 | 0.000788 | 18.0651 | 2023-02-06 04:41:32.101 | True | True | BTCUSD4 |
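
The conversion with `unit='ms'` treats each value as a number of milliseconds since the Unix epoch, expressed in UTC. For a single value, the same conversion can be sketched with the standard library alone; the helper name `ms_to_utc` is hypothetical, and integer arithmetic on a `timedelta` is used to avoid floating-point rounding of the milliseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Timestamp of the first trade shown in this section
print(ms_to_utc(1675658469799).isoformat(timespec="milliseconds"))
# 2023-02-06T04:41:09.799+00:00
```

The result matches the first row of the converted table, with the UTC offset made explicit.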
