# Introduction
This Jupyter Notebook aims to analyze user behavior across different lending protocols, focusing on how users interact with various tokens as collateral and debt. Specifically, we will investigate the looping behavior of users, where assets are borrowed on one protocol and then deposited as collateral in another protocol. This analysis will help us understand the extent and impact of such behaviors on the lending ecosystem.

# Objectives
### Load the Data

- We will load loan data for multiple lending protocols from Google Cloud Storage. The datasets contain detailed information about users, their collateral, and debt across different protocols.
- The loading code is written so that the data source can be switched easily (e.g., from cloud storage to a local database).
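
One way to keep the data source swappable is a small registry of loader functions. The following is a minimal sketch, not the notebook's actual implementation: `LOADERS`, `register_loader`, `load_from_gcs`, and `load_all_loans` are hypothetical names, and the URL pattern is taken from the GCS cell further down in this notebook.

```python
from typing import Callable, Dict, Iterable

# Hypothetical registry mapping a source name to a loader function.
# Each loader takes a protocol name and returns that protocol's loans.
LOADERS: Dict[str, Callable[[str], object]] = {}


def register_loader(source: str):
    """Decorator that registers a loader function under a source name."""
    def decorator(func):
        LOADERS[source] = func
        return func
    return decorator


@register_loader("gcs")
def load_from_gcs(protocol: str):
    """Download one protocol's loans.parquet file from the public bucket."""
    # Imported lazily so that the registry itself has no heavy dependencies.
    from io import BytesIO
    import pandas as pd
    import requests

    # Assumption: every protocol follows the <protocol>_data/loans.parquet layout.
    url = (
        "https://storage.googleapis.com/derisk-persistent-state/"
        f"{protocol}_data/loans.parquet"
    )
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    return pd.read_parquet(BytesIO(response.content))


def load_all_loans(source: str, protocols: Iterable[str]) -> dict:
    """Load every protocol from the chosen source into a dict keyed by protocol."""
    if source not in LOADERS:
        raise ValueError(f"Unknown source {source!r}; known: {sorted(LOADERS)}")
    loader = LOADERS[source]
    return {protocol: loader(protocol) for protocol in protocols}
```

With this layout, `load_all_loans("gcs", ["zklend", "nostra_alpha"])` would return one DataFrame per protocol, and a database-backed loader registered under another name (for example `"postgres"`) could be swapped in without changing the calling code.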

### Visualize User Behavior
- We will create visualizations to track the behavior of individual users across lending protocols, focusing on specific tokens such as "ETH", "wBTC", "USDC", "DAI", "USDT", "wstETH", "LORDS", "STRK", "UNO", and "ZEND".
- The visualizations will help answer several key questions:
  - How many users have borrowed an asset on one protocol and deposited the asset as collateral in another protocol?
  - How many users have completed a loop, i.e., deposited token X as collateral, borrowed token Y, deposited Y in another protocol, and borrowed X again?
  - What is the total dollar amount of tokens involved in these loops? By what factor does looping multiply the original deposits?
  - Which protocols are most subject to looping behavior? How do they compare on a per-token basis?
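
The first two questions above can be sketched on toy data. This is a simplified illustration, not the notebook's method: the `positions` records, the user addresses, and the function names are all hypothetical, and the real loans tables may be structured differently. A snapshot of balances can only show that a user's positions are consistent with looping. It cannot prove that the borrowed tokens are the same ones that were deposited.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical, simplified positions:
# (user, protocol, collateral tokens, debt tokens)
positions = [
    ("0xA", "zklend", {"ETH"}, {"USDC"}),
    ("0xA", "nostra_mainnet", {"USDC"}, {"ETH"}),
    ("0xB", "zklend", {"wBTC"}, {"DAI"}),
    ("0xB", "hashstack_v1", {"DAI"}, set()),
    ("0xC", "zklend", {"ETH"}, {"USDT"}),
]


def find_cross_protocol_flows(positions):
    """Return {user: {(token, borrow_protocol, deposit_protocol), ...}}."""
    by_user = defaultdict(list)
    for user, protocol, collateral, debt in positions:
        by_user[user].append((protocol, collateral, debt))

    flows = defaultdict(set)
    for user, user_positions in by_user.items():
        # Compare every ordered pair of positions held by the same user.
        for (p1, _, debt1), (p2, coll2, _) in permutations(user_positions, 2):
            if p1 == p2:
                continue  # Only flows between different protocols count.
            # A token that is debt on p1 and collateral on p2 is a candidate flow.
            for token in debt1 & coll2:
                flows[user].add((token, p1, p2))
    return dict(flows)


def find_closed_loops(flows):
    """Return users with a flow from A to B and a different token flowing back from B to A."""
    loopers = set()
    for user, user_flows in flows.items():
        for token_x, a, b in user_flows:
            if any(
                src == b and dst == a and token_y != token_x
                for token_y, src, dst in user_flows
            ):
                loopers.add(user)
    return loopers


flows = find_cross_protocol_flows(positions)
loopers = find_closed_loops(flows)
print(f"Users with cross-protocol flows: {sorted(flows)}")
print(f"Users with a closed loop: {sorted(loopers)}")
```

On this toy data, users `0xA` and `0xB` both show a cross-protocol flow, but only `0xA` closes the loop by borrowing ETH on the second protocol while holding ETH as collateral on the first.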

# Analysis and Insights
The analysis will not only address the predefined questions but also explore additional metrics and hypotheses that may arise during the investigation.
Meaningful outputs and insights will be provided, documenting the findings and their implications for the lending protocols.

# Loading the Data

### From a Local Database

#### Postgres

In [None]:
import pandas as pd
import psycopg2
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display  # Renders DataFrames as rich tables in Jupyter

# Connect to the PostgreSQL database
conn = psycopg2.connect(
    host='your_host',
    user='your_username',
    password='your_password',
    dbname='loans_db'
)

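# List of protocols (table names in the PostgreSQL database)
protocols = ["zklend", "nostra_alpha", "nostra_mainnet", "hashstack_v0", "hashstack_v1"]

# Set the display options once, before the loop
pd.set_option('display.max_columns', None)  # Display all columns
pd.set_option('display.max_colwidth', None)  # Display full column width
pd.set_option('display.width', None)  # Adjust display width

# Keep one DataFrame per protocol so that later cells can compare them
loans = {}

for protocol in protocols:
    print(f"Processing {protocol}...")

    # Query the data from the PostgreSQL database.
    # The table name comes from the fixed list above, never from user input.
    # Note: pandas may warn that it prefers an SQLAlchemy connectable
    # over a raw DBAPI connection.
    query = f"SELECT * FROM {protocol}"
    df = pd.read_sql_query(query, conn)
    loans[protocol] = df

    # Display the first rows
    display(df.head())

# Close the connection
conn.close()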


#### MySQL

In [None]:
import pandas as pd
import mysql.connector
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display  # Renders DataFrames as rich tables in Jupyter

# Connect to the MySQL database
conn = mysql.connector.connect(
    host='your_host',
    user='your_username',
    password='your_password',
    database='loans_db'
)

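# List of protocols (table names in the MySQL database)
protocols = ["zklend", "nostra_alpha", "nostra_mainnet", "hashstack_v0", "hashstack_v1"]

# Set the display options once, before the loop
pd.set_option('display.max_columns', None)  # Display all columns
pd.set_option('display.max_colwidth', None)  # Display full column width
pd.set_option('display.width', None)  # Adjust display width

# Keep one DataFrame per protocol so that later cells can compare them
loans = {}

for protocol in protocols:
    print(f"Processing {protocol}...")

    # Query the data from the MySQL database.
    # The table name comes from the fixed list above, never from user input.
    # Note: pandas may warn that it prefers an SQLAlchemy connectable
    # over a raw DBAPI connection.
    query = f"SELECT * FROM {protocol}"
    df = pd.read_sql_query(query, conn)
    loans[protocol] = df

    # Display the first rows
    display(df.head())

# Close the connection
conn.close()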


### From GCS

In [None]:
import pandas as pd
import pyarrow.parquet as pq
import requests
import matplotlib.pyplot as plt
import seaborn as sns
from io import BytesIO
from IPython.display import display  # Renders DataFrames as rich tables in Jupyter

# URLs of the loans files for all lending protocols
parquet_urls = {
    "zklend": "https://storage.googleapis.com/derisk-persistent-state/zklend_data/loans.parquet",
    "nostra_alpha": "https://storage.googleapis.com/derisk-persistent-state/nostra_alpha_data/loans.parquet",
    "nostra_mainnet": "https://storage.googleapis.com/derisk-persistent-state/nostra_mainnet_data/loans.parquet",
    "hashstack_v0": "https://storage.googleapis.com/derisk-persistent-state/hashstack_v0_data/loans.parquet",
    "hashstack_v1": "https://storage.googleapis.com/derisk-persistent-state/hashstack_v1_data/loans.parquet"
}


# Set the display options once, before the loop
pd.set_option('display.max_columns', None)  # Display all columns
pd.set_option('display.max_colwidth', None)  # Display full column width
pd.set_option('display.width', None)  # Adjust display width

# Keep one DataFrame per protocol so that later cells can compare them
loans = {}

for protocol, url in parquet_urls.items():
    print(f"Processing {protocol}...")

    # Download the file (the timeout avoids hanging on a stalled connection)
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Ensure the request was successful

    # Read the Parquet file into a Pandas DataFrame
    with BytesIO(response.content) as f:
        df = pq.read_table(f).to_pandas()
    loans[protocol] = df

    # Display the first rows
    display(df.head())

