### Write well, I'm going to be using this for a long time

#### Data we need:
User input:
- investment amount and trading pair -> amt0, amt1
- start time and end time
- length of the time period over which swap price, swap volume and liquidity positions are assumed fixed
- upper and lower price of the position
- pool_fee_rate

Data from the API:
- cprice for each time period (from tick i: price = 1.0001 ** i)
- L_pool for each time period at the given pool_fee_rate (the liquidity field? or simply total X tokens + Y tokens in USD)
- swap volume for each time period at the given pool_fee_rate (volumeUSD?)
- gas cost to mint for each time period
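
The tick-to-price relation in the list above can be sketched as follows. The function name `tick_to_price` and the decimals arguments are illustrative and not part of this notebook. The decimals adjustment assumes the subgraph tick describes raw (smallest-unit) token amounts, which agrees with the USDC/WETH sample further down, where tick 199284 corresponds to roughly 2216 USDC per WETH.

```python
def tick_to_price(tick, decimals0, decimals1):
    """Price of token0 quoted in token1 (human units) at the lower edge of `tick`."""
    raw_price = 1.0001 ** int(tick)  # token1 raw units per token0 raw unit
    return raw_price * 10 ** (decimals0 - decimals1)


# Assumption: USDC has 6 decimals and WETH has 18 (check the token metadata).
weth_per_usdc = tick_to_price(199284, 6, 18)
usdc_per_weth = 1 / weth_per_usdc
```

The result is the price at the tick boundary, so it can differ slightly from the `token0Price` the subgraph reports for the same hour.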

--------------------------------------------------------------------------------------------------------------

#### Fees
The liquidity amount is calculated from the following numbers that describe a position:
- amount of token 0 (amt0), amount of token 1 (amt1),
- price (quoted as units of token 1 per token 0) at the upper limit of the position (upper),
- price at the lower limit of the position (lower),
- and the current swap price (cprice).

The liquidity of a position (L_you) is then calculated as follows:

Case 1: cprice <= lower
- liquidity = amt0 * (sqrt(upper) * sqrt(lower)) / (sqrt(upper) - sqrt(lower))

Case 2: lower < cprice <= upper
- liquidity is the min of the following two calculations:
- amt0 * (sqrt(upper) * sqrt(cprice)) / (sqrt(upper) - sqrt(cprice))
- amt1 / (sqrt(cprice) - sqrt(lower))

Case 3: upper < cprice
- liquidity = amt1 / (sqrt(upper) - sqrt(lower))

Resources
- reference implementation of these liquidity calculations: https://github.com/JNP777/UNI_V3-Liquitidy-amounts-calcs/blob/main/UNI_v3_funcs.py
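
A minimal sketch of the three cases above, written independently of the linked repository. The function name `position_liquidity` is an assumption for illustration. One deliberate deviation: `cprice == upper` is routed to Case 3, because the amt0 term of Case 2 divides by zero at that point and both cases give the same value there.

```python
import math


def position_liquidity(amt0, amt1, cprice, lower, upper):
    """Liquidity of a position from token amounts and prices (token1 per token0)."""
    sqrt_lower = math.sqrt(lower)
    sqrt_upper = math.sqrt(upper)
    sqrt_price = math.sqrt(cprice)

    # Case 1: price at or below the range, the position holds only token0
    if cprice <= lower:
        return amt0 * (sqrt_upper * sqrt_lower) / (sqrt_upper - sqrt_lower)

    # Case 3: price at or above the range, the position holds only token1
    if cprice >= upper:
        return amt1 / (sqrt_upper - sqrt_lower)

    # Case 2: price inside the range, the scarcer token limits the liquidity
    liq0 = amt0 * (sqrt_upper * sqrt_price) / (sqrt_upper - sqrt_price)
    liq1 = amt1 / (sqrt_price - sqrt_lower)
    return min(liq0, liq1)
```

For example, `position_liquidity(1, 1, 4.0, 1.0, 9.0)` falls into Case 2, where the token1 amount is the binding constraint.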

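Fee is calculated by:
- Fee income = (L_you / L_pool) * swap volume in the fixed time period * pool_fee_rate / 100, with pool_fee_rate in percent (feeTier 3000 = 0.3%)
- L_you should be the liquidity active at the current tick only, not the whole amount you provided. It is not linear in the amounts; it is calculated from the 3 cases above
- In Case 1 and Case 3 the price is outside the position's range, so the position is inactive and its fee income for that period is 0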
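
The fee formula can be sketched like this. The names `fee_tier_to_percent` and `fee_income` and the `in_range` flag are illustrative assumptions. The sketch uses L_you / L_pool exactly as written above; whether L_pool should already include your own liquidity is left open here.

```python
def fee_tier_to_percent(fee_tier):
    """Convert a subgraph feeTier such as '3000' into a percentage (0.3)."""
    return int(fee_tier) / 10_000


def fee_income(l_you, l_pool, swap_volume_usd, pool_fee_rate_percent, in_range=True):
    """Fee income for one time period, following the formula in the notes."""
    # An out-of-range position (Case 1 or Case 3) is treated as earning nothing.
    if not in_range or l_pool <= 0:
        return 0.0
    return (l_you / l_pool) * swap_volume_usd * pool_fee_rate_percent / 100


# 1% share of the pool, 1,000,000 USD hourly volume, 0.3% fee tier
example_fee = fee_income(10, 1000, 1_000_000, fee_tier_to_percent("3000"))
```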


reference: https://uniswapv3.flipsidecrypto.com/
- check my numbers with the reference from the website

----------------------------------------------------------------------------------------

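#### Impermanent Loss (v2 formula)
- IL (as a fraction, e.g. -0.2 = -20%) = (2 * sqrt(p) / (p + 1)) - 1
- where p = r_t1 / r_t2
- and r_t is the price of token a in terms of token b at time t (t1 = stake time, t2 = evaluation time)
- Net $ loss = total asset value in dollars at stake time * IL (in %)
- This is the full-range (v2, constant product) formula. A concentrated v3 position typically loses more for the same price move, so treat this as an optimistic estimate for v3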
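
A short sketch of the formula above, with hypothetical function names. It returns the loss as a negative fraction, so a price ratio of 4 gives -0.2 (a 20% loss relative to holding). The formula is symmetric in p and 1/p, so the direction of the ratio does not matter.

```python
import math


def impermanent_loss(price_t1, price_t2):
    """Full-range (v2 style) impermanent loss as a fraction, always <= 0."""
    p = price_t1 / price_t2
    return 2 * math.sqrt(p) / (p + 1) - 1


def net_loss_usd(value_at_stake_usd, price_t1, price_t2):
    """Dollar loss as a positive number, following the notes above."""
    return value_at_stake_usd * abs(impermanent_loss(price_t1, price_t2))
```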

reference: https://chainbulletin.com/impermanent-loss-explained-with-examples-math/#:~:text=Impermanent%20loss%20is%20the%20difference,is%20equal%20to%20200%20DAI

--------------------------------------------------------------------------------------------------------------

#### Other cost

Gas_costs_mint = 500000 gas units * gas_price at that time (??? double check the actual gas used)
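
A sketch of the unit conversion, assuming the gas price is given in wei (the swap sample further down shows `gasPrice` values such as 17000000000, i.e. 17 gwei) and that an ETH price in USD is available from elsewhere. The function name and the 500000 gas figure are placeholders taken from the note above, not measured values.

```python
WEI_PER_ETH = 10 ** 18


def gas_cost_usd(gas_used, gas_price_wei, eth_price_usd):
    """Transaction cost in USD: gas units * price per gas unit, converted from wei."""
    cost_eth = int(gas_used) * int(gas_price_wei) / WEI_PER_ETH
    return cost_eth * eth_price_usd


# 500000 gas at 20 gwei with ETH at 2000 USD
example_mint_cost = gas_cost_usd(500_000, 20 * 10 ** 9, 2000.0)
```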

#### PNL/APR
- PNL = accumulated Fees_accrued (dollar value at generation) - IL (dollar loss, as a positive number) - Gas_costs_mint

- APR = (PNL / Initial_capital) * (year time / age of the position)
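
A minimal sketch of both quantities, with hypothetical function names. The APR is annualised by scaling the return with year / age, which is the usual convention: a 1% return earned in a tenth of a year becomes 10% per year. All inputs are assumed to be in USD, with the impermanent loss passed as a positive dollar amount.

```python
def pnl_usd(fees_usd, il_usd, gas_costs_mint_usd):
    """Profit and loss: accrued fees minus impermanent loss minus mint gas cost."""
    return fees_usd - il_usd - gas_costs_mint_usd


def apr(pnl, initial_capital, age_days, days_per_year=365.0):
    """Annualised return as a fraction of the initial capital."""
    return (pnl / initial_capital) * (days_per_year / age_days)


example_pnl = pnl_usd(fees_usd=150.0, il_usd=30.0, gas_costs_mint_usd=20.0)
example_apr = apr(example_pnl, initial_capital=10_000.0, age_days=36.5)
```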

--------------------------------------------------------------------------------------------------------------

## Dependencies

In [1]:
import requests
import json
import pandas as pd
import math

## Main Functions

In [2]:
# function to use requests.post to make an API call to the subgraph url
def run_query(q):

    # endpoint where you are making the request
    request = requests.post('https://api.thegraph.com/subgraphs/name/uniswap/uniswap-v3',
                            json={'query': q})
    if request.status_code == 200:
        return request
    else:
        # include the query text in the error to ease debugging
        raise Exception('Query failed. return code is {}.      {}'.format(request.status_code, q))
        
        
# turns requests into dataframe        
def results_to_df(query_result):
    json_data_ = json.loads(query_result.text)
    df_data_ = json_data_['data']['pools']
    df_ = pd.DataFrame(df_data_)

    return df_

In [3]:
def get_token_id(symbol):
    
    # default should be first:10, in case there are more than 1 coins with the same symbol
    query_ = """ 
    {{
      tokens(first:1, where:{{symbol: "{}"}}) {{
        id
        symbol
        name
      }}
    }}""".format(symbol)
    
    # run query
    query_result_ = run_query(query_)
    json_data_ = json.loads(query_result_.text)
    
    print(' ')
    print('get_token_id: {}'.format(symbol))
    print(json_data_)
    
    # make sure only return 1 object
    if len(json_data_['data']['tokens']) == 1:
        token_id_ = json_data_['data']['tokens'][0]['id']
        return token_id_
        
    else:
        print(json_data_['data'])
        raise Exception('Returned number of token_ids != 1')

        
def get_pool_id(token0_id, token1_id, feeTier):
    query_ = """
    {{
      pools(first: 10, 
        where:{{token0: "{}",
        token1: "{}",
        feeTier:"{}" }}) 
      {{
        id
        token0{{symbol}}
        token1{{symbol}}
        feeTier
      }}
    }}""".format(token0_id, token1_id, feeTier)
    
    
    # run query
    query_result_ = run_query(query_)
    json_data_ = json.loads(query_result_.text)
    
    print('\n get_pool_id for feeTier: {}'.format(feeTier))
    print(json_data_)
    
    # make sure there is only 1 pool that matches exactly
    if len(json_data_['data']['pools']) == 1:
        pool_id_ = json_data_['data']['pools'][0]['id']
        return pool_id_
    else:
        print(json_data_['data'])
        raise Exception('Returned number of pool_ids != 1')

In [4]:
def get_poolDayDatas(pool_id, num_datapoints=1000):
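    # input: pool_id
    # num_datapoints: rounded down to a multiple of max_request_ (the remainder is not fetched)
    # note: The Graph rejects skip values above 5000, so at most 6000 rows can be paged this way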
    
    max_request_ = 1000
    quotient_ = math.floor(num_datapoints/max_request_)
    remainder_ = num_datapoints%max_request_
            
    query_base_ = '''
    {{
      poolDayDatas(first:{},
      skip: {},
        where:{{ pool: "{}" }},
      orderBy:date,
      orderDirection: desc) 
      {{
        date
        tick
        liquidity
        volumeUSD
        pool{{
            token0{{
                symbol
            }}
            token1{{
                symbol
            }}
        }}
      }}
    }}'''
    
    poolDayDatas_array_ = []
    
    # query loop
    for i in range(quotient_):
        q_first_ = max_request_
        q_next_ = i*max_request_
        query_ = query_base_.format(q_first_, q_next_, pool_id)
        query_result_ = run_query(query_)
        json_data_ = json.loads(query_result_.text)
#         print(json_data_)
        poolDayDatas_array_ += json_data_['data']['poolDayDatas']
    
    print(' ')
    print('\n Queried PoolDayDatas, total of {} datapoints'.format(str(len(poolDayDatas_array_))))
    print('example:')
    print(poolDayDatas_array_[0])
    
    # array to dataframe
    df_ = pd.json_normalize(poolDayDatas_array_)
    # drop_duplicates returns a new frame, so the result must be assigned
    df_ = df_.drop_duplicates(subset=['date'])
    
    return df_

In [9]:
def get_poolHourDatas(pool_id, num_datapoints=3000):
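    # input: pool_id
    # num_datapoints: rounded down to a multiple of max_request_ (the remainder is not fetched)
    # note: The Graph rejects skip values above 5000, so at most 6000 rows can be paged this way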
    
    max_request_ = 1000
    quotient_ = math.floor(num_datapoints/max_request_)
    remainder_ = num_datapoints%max_request_
            
    query_base_ = '''
    {{
      poolHourDatas(first:{},
      skip: {},
        where:{{ pool: "{}" }},
      orderBy:periodStartUnix,
      orderDirection: desc) 
      {{
        periodStartUnix
        pool{{
            token0{{
                symbol
            }}
            token1{{
                symbol
            }}
        }}
        liquidity
        sqrtPrice
        token0Price
        token1Price
        tick
        feeGrowthGlobal0X128
        feeGrowthGlobal1X128
        tvlUSD
        volumeToken0
        volumeToken1
        volumeUSD
        feesUSD
        txCount
        open
        high
        low
        close
      }}
    }}'''
    
    poolDayDatas_array_ = []
    
    # query loop
    for i in range(quotient_):
        q_first_ = max_request_
        q_next_ = i*max_request_
        query_ = query_base_.format(q_first_, q_next_, pool_id)
        query_result_ = run_query(query_)
        json_data_ = json.loads(query_result_.text)
#         print(json_data_)
        try:
            poolDayDatas_array_ += json_data_['data']['poolHourDatas']
        except Exception:
            print('.. Pass')
            pass
    
    print(' ')
    print('\n Queried poolHourDatas, total of {} datapoints'.format(str(len(poolDayDatas_array_))))
    print('example:')
    print(poolDayDatas_array_[0])
    
    # array to dataframe
    df_ = pd.json_normalize(poolDayDatas_array_)
    # drop_duplicates returns a new frame, so the result must be assigned
    df_ = df_.drop_duplicates(subset=['periodStartUnix'])
    
    return df_

In [14]:
def get_swaps(pool_id, time_start='1627369200', time_end='1623772800', num_datapoints=20000):
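    # input: pool_id
    # time_start: upper (most recent) timestamp bound, time_end: lower (oldest) bound;
    #   results are ordered newest first, so time_start > time_end
    # num_datapoints: rounded down to a multiple of max_request_ (the remainder is not fetched)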
    
    max_request_ = 1000
    quotient_ = math.floor(num_datapoints/max_request_)
    remainder_ = num_datapoints%max_request_
           
    ## TODO: pages with skip > 5000 are rejected by The Graph and end up in the except branch below
    query_base_ = '''
    {{
      swaps(first:{}, skip: {},
            where:{{ pool: "{}",
            timestamp_lt: "{}",
            timestamp_gt: "{}"}},
          orderBy:timestamp,
          orderDirection: desc){{
        transaction {{
          blockNumber
          timestamp
          gasUsed
          gasPrice
        }}
        id
        timestamp
        tick
        amount0
        amount1
        amountUSD
        sqrtPriceX96
      }}
    }}'''
    
    swap_arrays_ = []
    
    # query loop
    for i in range(quotient_):
        q_first_ = max_request_
        q_next_ = i*max_request_
        query_ = query_base_.format(q_first_, q_next_, pool_id, time_start, time_end)
        query_result_ = run_query(query_)
        json_data_ = json.loads(query_result_.text)
#         print(query_)
        
        try:
            swap_arrays_ += json_data_['data']['swaps']
        except Exception:
            print('.. Pass')
            pass
        
    print(' ')
    print('\n Queried Swaps, total of {} datapoints'.format(str(len(swap_arrays_))))
    print('example:')
    print(swap_arrays_[0])
    
    # array to dataframe
    df_ = pd.json_normalize(swap_arrays_)
#     df_.drop_duplicates(subset=['id']) 
    
    return df_
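
One way around the `skip` limit is to page by timestamp instead: each page asks for rows older than the last row already seen. The sketch below assumes a hypothetical callable `fetch_page(upper, lower, first)` that returns up to `first` rows with `lower < timestamp < upper`, newest first, as a `swaps` query with `timestamp_lt` / `timestamp_gt` would. It is not wired to the subgraph here. Rows sharing the boundary timestamp are fetched again and dropped by `id`.

```python
def fetch_all_desc(fetch_page, newest, oldest, page_size=1000):
    """Collect all rows with oldest < timestamp < newest, newest first."""
    rows = []
    seen_ids = set()
    upper = int(newest)
    lower = int(oldest)

    while True:
        page = fetch_page(upper, lower, page_size)
        fresh = [r for r in page if r["id"] not in seen_ids]
        if not fresh:
            break  # nothing new: done (or a single timestamp fills a whole page)
        rows.extend(fresh)
        seen_ids.update(r["id"] for r in fresh)
        if len(page) < page_size:
            break  # a short page means the range is exhausted
        # Re-include the boundary timestamp so ties are not lost; duplicates
        # are filtered by id on the next round.
        upper = int(page[-1]["timestamp"]) + 1

    return rows
```

The result can be passed to `pd.json_normalize` in the same way as `swap_arrays_` above.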

### Test running

In [7]:
# Get token_id > Get pool_id > Get PoolDayDatas > Get swap data > Merge Swap data (VolumeUSD, txCount - for checking)

# Indicate Tokens and FeeTier
token0_id = get_token_id('USDC')
token1_id = get_token_id('WETH')
feeTier = '3000'
pool_id = get_pool_id(token0_id, token1_id, feeTier)

 
get_token_id: USDC
{'data': {'tokens': [{'id': '0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48', 'name': 'USD Coin', 'symbol': 'USDC'}]}}
 
get_token_id: WETH
{'data': {'tokens': [{'id': '0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2', 'name': 'Wrapped Ether', 'symbol': 'WETH'}]}}

 get_pool_id for feeTier: 3000
{'data': {'pools': [{'feeTier': '3000', 'id': '0x8ad599c3a0ff1de082011efddc58f1908eb6e6d8', 'token0': {'symbol': 'USDC'}, 'token1': {'symbol': 'WETH'}}]}}


In [10]:
# Get poolHourDatas
df_poolHourDatas = get_poolHourDatas(pool_id, num_datapoints=10000)


.. Pass
.. Pass
.. Pass
.. Pass
 

 Queried poolHourDatas, total of 1992 datapoints
example:
{'close': '2216.306267410391837112346737388033', 'feeGrowthGlobal0X128': '894428956562884715070851895944071', 'feeGrowthGlobal1X128': '358286707906318006638723640353639067903496', 'feesUSD': '0', 'high': '2216.306267410391837112346737388033', 'liquidity': '17758984097793142685', 'low': '2213.560299325294444354815279758329', 'open': '2213.560299325294444354815279758329', 'periodStartUnix': 1627376400, 'pool': {'token0': {'symbol': 'USDC'}, 'token1': {'symbol': 'WETH'}}, 'sqrtPrice': '1682924746153310698010032826405131', 'tick': '199284', 'token0Price': '2216.306267410391837112346737388033', 'token1Price': '0.0004512011786026460391748754724793668', 'tvlUSD': '282326301.3313681056108189699064874', 'txCount': '22', 'volumeToken0': '0', 'volumeToken1': '0', 'volumeUSD': '0'}


In [15]:
# Get Swap Datas within the poolHourDatas timeframe
time_start = df_poolHourDatas['periodStartUnix'][0]
time_end = df_poolHourDatas['periodStartUnix'][df_poolHourDatas.index[-1]]

df_swaps = get_swaps(pool_id, time_start, time_end, num_datapoints=10000)

.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
.. Pass
 

 Queried Swaps, total of 6000 datapoints
example:
{'amount0': '1685.106407', 'amount1': '-0.758982925002697166', 'amountUSD': '1682.577055177528500971562767982368', 'id': '0xd6e4a636fde9e1e95ced4d0fcb7a10c6d9d0e3bba5a8f8f74a36d1644859bc12#146727', 'sqrtPriceX96': '1683968274387083196451198826429933', 'tick': '199296', 'timestamp': '1627375226', 'transaction': {'blockNumber': '12907143', 'gasPrice': '17000000000', 'gasUsed': '215110', 'timestamp': '1627375226'}}


In [None]:
# Merge Data
FEATURES = [' ', ' ']

# Merge logic

# Check txCount

# Look to confirm data
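
A possible shape for the merge step, as a sketch only. It buckets swaps into hours by flooring the timestamp, sums `amountUSD` per hour and left-joins the result onto the hourly pool data. The function names and the new column names `swapVolumeUSD` and `swapCount` are assumptions. Note that `txCount` may count more than swaps (for example mints and burns), so it need not equal `swapCount`.

```python
import pandas as pd


def hourly_swap_summary(df_swaps):
    """Aggregate swaps into hourly buckets keyed like poolHourDatas.periodStartUnix."""
    out = df_swaps.copy()
    # the subgraph returns numbers as strings
    out["timestamp"] = pd.to_numeric(out["timestamp"])
    out["amountUSD"] = pd.to_numeric(out["amountUSD"])
    out["periodStartUnix"] = out["timestamp"] // 3600 * 3600
    return (
        out.groupby("periodStartUnix")
        .agg(swapVolumeUSD=("amountUSD", "sum"), swapCount=("id", "count"))
        .reset_index()
    )


def merge_hourly(df_hours, df_swaps):
    """Left-join the hourly swap summary onto the hourly pool data."""
    hours = df_hours.copy()
    hours["periodStartUnix"] = pd.to_numeric(hours["periodStartUnix"])
    return hours.merge(hourly_swap_summary(df_swaps), on="periodStartUnix", how="left")
```

Comparing `swapVolumeUSD` with `volumeUSD` per hour is then a direct column comparison on the merged frame.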

In [16]:
df_swaps['id'].value_counts()

0x61f699a688bf2ba4447f7009f3742a262eccfceeda7b200fb39c5b2790d69b6b#138493    1
0x5a3969fdfdd07ecfa6298caeb22fa623815313b0f03d924d21aa454ca96ea5cb#142782    1
0x8b3c8d772813101cfbe6e4fec04f9076dbd47ad636df5e8952148c51176f3e21#140232    1
0x34467ccb6488fcec6cf7ab7528b8076483d52fe5225f47d99eaf4e97151e5319#139190    1
0xeb279d90923e3c4bace4c6a603d6cb369c15ca3393d46632ddc8c2ae7c29eb52#138229    1
                                                                            ..
0xdc1c111e0a42fa86eb1a1b9ef37445d3c18b21fee45b11cbea6ffcb253740f75#145599    1
0x7a420c6d9d3e50d26e4c1c0f278001ed70fef46ef60d8b934d9a544243d219bc#140953    1
0xb453a6991dbf6dd1f03e6af94d8af3f0ab311db3a27a2f4e32eb79ff624bfe51#140567    1
0xefbfb0d0461f8e40f2044646ef0d0f8b1f9869cb3c58d027824600c90b318c1c#139941    1
0xafe8e0187347924054d17adc4627328d877617b074a99f83a9a349acd98bcf54#136829    1
Name: id, Length: 6000, dtype: int64

In [17]:
df_swaps.head()

Unnamed: 0,amount0,amount1,amountUSD,id,sqrtPriceX96,tick,timestamp,transaction.blockNumber,transaction.gasPrice,transaction.gasUsed,transaction.timestamp
0,1685.106407,-0.7589829250026972,1682.5770551775283,0xd6e4a636fde9e1e95ced4d0fcb7a10c6d9d0e3bba5a8...,1683968274387083196451198826429933,199296,1627375226,12907143,17000000000,215110,1627375226
1,182282.111604,-82.11914563381555,181988.85120606847,0x22de3712dc3b291259acc72ccc27767ee6eaaf516a9a...,1683971665928123683914867073910701,199296,1627374010,12907044,1,1200000,1627374010
2,170784.144738,-76.9717924002703,170510.55113936088,0xfb9641a355ba02c3a41ee34fb5b7532f5b7af7e1313e...,1684339372528251680417922950418094,199301,1627374003,12907043,20321384573,300000,1627374003
3,266777.484182,-120.29869655117564,266334.80953569664,0xb8dc0e467f59c4d6a0d321e6c5ed9f971a221bebb2c3...,1684684030718139210655881053238285,199305,1627373992,12907042,21054772407,300000,1627373992
4,186639.467974,-84.20758970570738,186338.69635920567,0xaf8307f821ad65e91678c1058e7885fee66dad62a2f2...,1685222694684590014921396324201719,199311,1627373981,12907041,1,1200000,1627373981


In [19]:
df_poolHourDatas.to_csv('poolHourDatas-USDC-WETH-3000.csv')

##### Query tokens with symbol
{
  tokens(first:10, where:{symbol: "WETH"}) {
    id
    symbol
    name
  }
}

##### Query pools with token0 id, token1 id and feeTier
{
  pools(first:10, 
    where:{token0:"0x6b175474e89094c44da98b954eedeac495271d0f",
    token1: "0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2",
    feeTier:"3000" }) 
  {
    id
    token0{symbol}
    token1{symbol}
    feeTier
  }
}

##### Query poolDayDatas with pool id, order by date - Needs to be iterative (max 1000 rows per query)
{
  poolDayDatas(first:1000,
  skip: 1000,
    where:{pool:"0xa80964c5bbd1a0e95777094420555fead1a26c1e"},
  orderBy:date,
  orderDirection: desc) 
  {
    date
    tick
    liquidity
    volumeUSD
  }
}





##### Query examples on filtering

{
  pools
  (first: 10, 
    where: {liquidity_gt: "1000000", 
      feeTier: "10000"}
    orderBy: liquidity, 
    orderDirection: desc)
  {
    token0{symbol}
    token1{symbol}
    liquidity
  }
}


(token0) DAI id = 0x6b175474e89094c44da98b954eedeac495271d0f
(token1) WETH id = 0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2
(feeTier) "3000"

(DAI-WETH 500) Pool id = 0x60594a405d53811d3bc4766596efd80fd545a270
(DAI-WETH 3000) Pool id = 0xc2e9f25be6257c210d7adf0d4cd6e3e881ba25f8
(DAI-WETH 10000) Pool id = 0xa80964c5bbd1a0e95777094420555fead1a26c1e


