# Adi Gohain 
# Term Project 
# MSDS Financial Engineering:

### LOADING EXTERNAL LIBRARIES

In the section below, I load all external libraries required for the trading system.

- **NumPy**: vectorized numerical operations, especially for portfolio simulations.
- **Pandas**: time series handling, financial data structures, percent-change returns, rolling windows, etc.
- **Matplotlib & Seaborn**: visualization of strategy performance, risk metrics, and Monte Carlo results.
- **yfinance**: historical price downloads from Yahoo Finance for stocks, ETFs, and futures.
- **os & datetime**: directory management and date formatting for saved outputs.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf
import os
from datetime import datetime

### INGESTION OF DATA

Here I define the global tickers for my commodities, equities, and benchmarks.
I did this to make the system modular: changing tickers requires editing only these blocks.
I also create the directories OUTPUT_DIR and CURVES_DIR to keep a clean file structure for project output, figures, and logs. The project can be run from any root folder, and the code creates these folders automatically before saving files.

I am also including the following ETFs in my mix to add diversification. Adding these ETFs, chosen for their low correlation with the four industrial equities, is intended to improve portfolio performance.

Here is the reasoning behind picking these ETFs (they were suggested by ChatGPT):

Asset Classes With Low or Negative Correlation to Industrials

Long-Duration U.S. Treasuries (defensive macro hedge)
- 20+ Year U.S. Treasuries **TLT**
- 7–10 Year Treasuries **IEF**

Commodities and Real Assets (energy and metals, non-agricultural)
- Gold **GLD**
- Broad Commodities (energy-weighted) **DBC**

International Diversification / Developed Markets (non-U.S.)
- Developed Markets ex-US **EFA**
- Vanguard Developed ex-US **VEA**

Market-Neutral / Low-Beta / Risk-Managed Strategies
- Anti-Beta (long low beta, short high beta) **BTAL** 
- Multi-strategy, alt-risk premia **LALT**

In [2]:
# Four common agricultural commodity futures: corn, soybeans, cotton, and wheat
COMMODITY_TICKERS = {"Corn": "ZC=F",
                     "Soybeans": "ZS=F",
                     "Cotton": "CT=F",
                     "Wheat": "ZW=F"}

# Four equities (John Deere, Caterpillar, AGCO Corporation, CNH Industrial) plus the eight diversifying ETFs listed above

EQUITY_TICKERS = {"DE": "DE",
                  "CAT": "CAT",
                  "AGCO": "AGCO",
                  "CNH": "CNH",
                  "20+ Year U.S. Treasuries":"TLT", 
                  "7–10 Year Treasuries":"IEF",
                  "Gold":"GLD",
                  "Broad Commodities (energy-weighted)":"DBC",
                  "Developed Markets ex-US":"EFA",
                  "Vanguard Developed ex-US":"VEA",
                  "Anti-Beta (long low beta, short high beta)":"BTAL", 
                  "Multi-strategy, alt-risk premia":"LALT"}

BENCHMARK_TICKERS = {"SPY": "SPY",
                     "QQQ": "QQQ",
                     "BND": "BND"}

# Assigning historical data start date to a variable START so it can be changed anytime and cascades downstream.
START = "2000-01-01"

OUTPUT_DIR = "output"

CURVES_DIR = os.path.join(OUTPUT_DIR, "curves")

os.makedirs(CURVES_DIR, exist_ok=True)

Below, the download_prices function downloads historical price data from Yahoo Finance for each group of tickers defined above.
I ran into issues where tickers had different history lengths and the returned DataFrames had inconsistent formats. To mitigate this, I added logic that handles the MultiIndex columns Yahoo can return (e.g. the ticker symbol nested under "Adj Close").

I used ChatGPT to develop logic for:
- Skipping unavailable tickers instead of crashing the pipeline.
- Using adjusted close prices, which reflect corporate actions (splits, dividends).
- Converting every time index to a DatetimeIndex, avoiding alignment bugs later.
- Collecting all valid series into a dictionary and concatenating them horizontally to create a clean, consistent price matrix in which each column corresponds to a single asset.
- Raising an error if no data is successfully retrieved, so the pipeline stops early instead of continuing with an empty price matrix.
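
The column handling described above can be tried without a network call. The sketch below is a minimal illustration and not the project's downloader: the DataFrame is a hand-built stand-in for a Yahoo result with two-level columns, the prices are made up, and `extract_adj_close` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a downloaded frame with (price field, ticker) columns.
dates = pd.date_range("2024-01-02", periods=4, freq="B")
cols = pd.MultiIndex.from_tuples([("Adj Close", "ZC=F"), ("Close", "ZC=F")])
data = pd.DataFrame(
    [[450.0, 451.0], [np.nan, 452.0], [455.0, 456.0], [458.0, 459.0]],
    index=dates,
    columns=cols,
)

def extract_adj_close(frame, ticker):
    """Return the adjusted-close series for `ticker`, or None if it is unavailable."""
    if isinstance(frame.columns, pd.MultiIndex):
        key = ("Adj Close", ticker)
        if key not in frame.columns:
            return None
        series = frame[key]
    elif "Adj Close" in frame.columns:
        series = frame["Adj Close"]
    else:
        return None
    series = series.dropna()
    return None if series.empty else series

series = extract_adj_close(data, "ZC=F")    # three valid rows survive dropna
missing = extract_adj_close(data, "ZW=F")   # ticker not present, so None
print(series)
print(missing)
```

Returning None for a missing ticker mirrors the "skip instead of crash" behavior: the caller can simply leave that asset out of the price matrix.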

In [3]:
def download_prices(ticker_dict, start=START):
    prices = {}
    for name, ticker in ticker_dict.items():
        print(f"Downloading {name} ({ticker})...")
        try:
            data = yf.download(ticker, start=start, auto_adjust=False, progress=False)
            if data is None or data.empty:
                print(f"Warning: no data available for {ticker}. Therefore, skipping {ticker}.")
                continue

            # If multi index futures, then select ('Adj Close', ticker)
            if isinstance(data.columns, pd.MultiIndex):
                try:
                    series = data["Adj Close"][ticker].dropna()
                except Exception:
                    print(f"Warning: Multi index but can not extract Adjusted Close for {ticker}. Therefore, skipping {ticker}.")
                    continue
            else:
                if "Adj Close" not in data.columns:
                    print(f"Warning: 'Adj Close' is missing for {ticker}. Therefore, skipping {ticker}.")
                    continue
                series = data["Adj Close"].dropna()

            if series.empty:
                print(f"Warning: Adjusted close is empty for {ticker}. Therefore, skipping {ticker}.")
                continue

            # ensuring datetime index
            series.index = pd.to_datetime(series.index)
            prices[name] = series

        except Exception as e:
            print(f"  Error downloading {ticker}: {e}")
            continue

    if len(prices) == 0:
        raise ValueError("No valid tickers downloaded.")

    df = pd.concat(prices, axis=1).sort_index()
    return df

### ALIGNING DATA

Because the assets follow different trading calendars (holidays, futures roll dates, missing history, etc.), I added this function to synchronize the commodity futures, the equity time series, and my benchmark ETFs.

It puts all datasets on a common calendar using index intersection.
Forward filling handles short gaps such as holidays or missing futures data.
Rows in which every column is missing are dropped to preserve numerical integrity.

This step matters because without it, the vectorized operations (weights × returns) were misaligned and produced incorrect results.
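
As a small illustration of the intersection-then-forward-fill idea, here is a sketch on two toy frames with made-up prices and dates. It only shows the mechanics described above and is not the project's alignment function.

```python
import numpy as np
import pandas as pd

# Toy frames on different calendars: the commodity has an extra early date
# and a one-day gap; the equity starts a day later.
comm = pd.DataFrame(
    {"Corn": [450.0, 452.0, np.nan, 458.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]),
)
eq = pd.DataFrame(
    {"DE": [390.0, 392.0, 395.0]},
    index=pd.to_datetime(["2024-01-03", "2024-01-04", "2024-01-05"]),
)

# Keep only dates present in both frames, then fill the short gap forward.
shared = comm.index.intersection(eq.index)
comm_a = comm.reindex(shared).ffill().dropna(how="all")
eq_a = eq.reindex(shared).ffill().dropna(how="all")

print(comm_a)
print(eq_a)
```

After this step both frames carry the same three dates, and the missing corn price on 2024-01-04 is carried forward from the previous day.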

In [4]:
def align_data(comm, eq, benchmarks=None):
    # Using intersection to ensure exact alignment
    shared = comm.index.intersection(eq.index)
    if benchmarks is not None:
        shared = shared.intersection(benchmarks.index)

    # Reindex to the shared calendar, forward-fill short gaps, drop all-NaN rows
    comm_a = comm.reindex(shared).ffill().dropna(how='all')
    eq_a = eq.reindex(shared).ffill().dropna(how='all')
    bench_a = None
    if benchmarks is not None:
        bench_a = benchmarks.reindex(shared).ffill().dropna(how='all')

    # dropna may remove different rows from each frame, so intersect once more
    shared = comm_a.index.intersection(eq_a.index)
    if bench_a is not None:
        shared = shared.intersection(bench_a.index)
        bench_a = bench_a.loc[shared]
    comm_a = comm_a.loc[shared]
    eq_a = eq_a.loc[shared]

    print(f"Aligned rows: {len(shared)} ({shared[0].date()} to {shared[-1].date()})")
    return comm_a, eq_a, bench_a

### SIGNAL GENERATION

In my model, I use momentum signals from commodities.

A commodity is “bullish” when its price closes above its moving average.
I combine multiple commodities to create a broad commodity risk-on signal.
I do so with three functions:
1. **compute_ma_signal()**

This one computes a binary indicator:
1 if price is above its moving average
0 if price is at or below it
This represents short-term commodity momentum.

2. **aggregate_signal()**

This one combines all commodities by summing their signals:
If at least N commodities are bullish (N is the threshold parameter), the model enters equities.
This reduces noise and avoids overreacting to the volatility of a single commodity.

Commodities are often leading indicators for industrial/equipment sector performance, so I chose this rule to capture macroeconomic demand conditions without relying on equity prices directly.

3. **apply_persistence_and_hold()**

I added this function to apply realistic trading constraints to my raw signals:
*min_consec_days* requires that many consecutive bullish days before entering a position. This filters out whipsaw signals.
*min_hold_days* enforces a mandatory holding period once a position is entered, which prevents overtrading and reduces transaction costs. This mimics institutional trading mandates where churn is expensive.
Internally, a finite-state loop steps day by day to extend a buy signal across the required holding window.
This produces a more stable and realistic signal series for backtesting.
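
To make the moving-average vote concrete, here is a minimal sketch on made-up prices for three commodities. The 3-day window and the five rows are assumptions chosen so the arithmetic is easy to follow by hand; the pipeline itself uses 20- and 40-day windows.

```python
import pandas as pd

# Made-up closing prices, five days, three commodities.
prices = pd.DataFrame(
    {
        "Corn": [100.0, 101.0, 102.0, 101.0, 103.0],
        "Wheat": [200.0, 198.0, 197.0, 199.0, 201.0],
        "Cotton": [80.0, 81.0, 80.0, 79.0, 82.0],
    },
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

window, threshold = 3, 2

# 1 where the close is above its own moving average, else 0.
ma = prices.rolling(window, min_periods=1).mean()
bullish = (prices > ma).astype(int)

# Count bullish commodities per day and compare with the threshold.
votes = bullish.sum(axis=1)
raw_signal = (votes >= threshold).astype(int)

print(bullish.assign(votes=votes, raw_signal=raw_signal))
# votes per day:      0, 2, 1, 1, 3
# raw_signal per day: 0, 1, 0, 0, 1
```

On the first day every price equals its one-point moving average, so no commodity counts as bullish; this is a side effect of `min_periods=1` rather than a real market reading.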

In [5]:
def compute_ma_signal(price, window):
    ma = price.rolling(window, min_periods=1).mean()
    return (price > ma).astype(int)

def aggregate_signal(comm_df, ma_window=20, threshold=2):
    sigs = {}
    for col in comm_df.columns:
        sigs[col] = compute_ma_signal(comm_df[col], ma_window)
    sig_df = pd.DataFrame(sigs)
    sig_df['sum'] = sig_df.sum(axis=1)
    sig_df['raw_signal'] = (sig_df['sum'] >= threshold).astype(int)
    return sig_df

def apply_persistence_and_hold(raw_signal, min_consec_days=0, min_hold_days=0):
    s = raw_signal.astype(int).copy()
    if min_consec_days > 0:
        rsum = s.rolling(min_consec_days).sum()
        s = (rsum >= min_consec_days).astype(int)
    if min_hold_days > 0:
        out = pd.Series(0, index=s.index)
        i = 0
        idx = list(s.index)
        n = len(idx)
        while i < n:
            if s.iloc[i] == 1:
                hold_end = min(i + min_hold_days, n)
                j = i
                while j < n and (j < hold_end or s.iloc[j] == 1):
                    out.iloc[j] = 1
                    j += 1
                i = j
            else:
                i += 1
        return out
    return s
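
The persistence filter and the hold window can be traced on a short toy series. The sketch below restates the two steps in condensed form under hypothetical names (`confirm` and `extend_hold`); it is meant for following the logic by hand and is not a replacement for the function above.

```python
import pandas as pd

def confirm(signal, min_consec_days):
    """1 only once the signal has been 1 for `min_consec_days` bars in a row."""
    rolled = signal.rolling(min_consec_days).sum()
    return (rolled >= min_consec_days).astype(int)

def extend_hold(signal, min_hold_days):
    """Once long, stay long for at least `min_hold_days` bars,
    and longer while the signal keeps printing 1."""
    values = signal.astype(int).tolist()
    out = [0] * len(values)
    i = 0
    while i < len(values):
        if values[i] == 1:
            hold_end = min(i + min_hold_days, len(values))
            j = i
            while j < len(values) and (j < hold_end or values[j] == 1):
                out[j] = 1
                j += 1
            i = j
        else:
            i += 1
    return pd.Series(out, index=signal.index)

raw = pd.Series([0, 1, 0, 0, 0, 1, 1, 0, 0, 0])
confirmed = confirm(raw, min_consec_days=2)
held = extend_hold(raw, min_hold_days=3)

print(confirmed.tolist())  # [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(held.tolist())       # [0, 1, 1, 1, 0, 1, 1, 1, 0, 0]
```

The one-day blip at position 1 is stretched to three days by the hold rule, while the two-day persistence rule ignores it entirely and only fires on the second consecutive bullish day.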

### BACKTESTING

Below, the backtesting process is my core module, where I translate my trading signals into daily portfolio returns.

I designed it to perform the following steps:
1. Compute daily equity returns using percent change.
2. Lag the signal by one day to prevent lookahead bias.
3. When the signal equals 1, invest equally across all equities.
4. When the signal equals 0, hold 0% exposure.

To estimate transaction costs realistically, I calculate turnover by measuring how much the portfolio weights change from day to day.

The transaction cost model is applied daily to produce net returns.
- Costs = turnover × (commission + slippage)

Gross and net cumulative equity curves are calculated via cumulative products.

The output df then includes:
- gross_ret
- net_ret
- turnover
- pos (exposure)
- gross_eq
- net_eq

This prepares my results for easy plotting, analysis, and exporting.
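
The steps above can be followed on a two-asset toy example. The prices and the signal are made up, and the cost rates are the defaults used in this project (0.0005 each); this is a sketch of the arithmetic, not a second backtester.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=4, freq="D")
prices = pd.DataFrame({"A": [100.0, 110.0, 121.0, 121.0],
                       "B": [50.0, 50.0, 55.0, 55.0]}, index=idx)
signal = pd.Series([1, 1, 0, 0], index=idx)
commission, slippage = 0.0005, 0.0005

eq_ret = prices.pct_change().fillna(0)
pos = signal.shift(1).fillna(0)  # trade on yesterday's signal

# Equal weight across assets when invested, zero otherwise.
weights = pd.DataFrame({col: pos / prices.shape[1] for col in prices.columns})

# Half the sum of absolute weight changes, as in the cost model above.
turnover = (weights - weights.shift(1).fillna(0)).abs().sum(axis=1) / 2.0
cost = turnover * (commission + slippage)

gross_ret = (weights * eq_ret).sum(axis=1)
net_ret = gross_ret - cost
net_eq = (1 + net_ret).cumprod()

print(pd.DataFrame({"pos": pos, "turnover": turnover,
                    "gross_ret": gross_ret, "net_ret": net_ret,
                    "net_eq": net_eq}))
```

Because of the division by two, a full move from cash to 100% invested registers a turnover of 0.5, so the entry on day two and the exit on day four each cost 0.0005 here.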

In [6]:
def compute_strategy_returns(eq_prices, signal, commission=0.0005, slippage=0.0005):
    eq_ret = eq_prices.pct_change().fillna(0)
    pos = signal.shift(1).fillna(0)  # lagged signal
    n_assets = eq_prices.shape[1]
    # ensuring position index matches eq_prices
    pos = pos.reindex(eq_prices.index).fillna(0)
    pos_df = pd.DataFrame(np.outer(pos, np.ones(n_assets)),
                          index=eq_prices.index,
                          columns=eq_prices.columns)
    
    weights = pos_df / n_assets
    
    prev_weights = weights.shift(1).fillna(0)
    
    turnover = (weights - prev_weights).abs().sum(axis=1) / 2.0
    
    cost = turnover * (commission + slippage)
    
    gross_ret = (weights * eq_ret).sum(axis=1)
    
    net_ret = gross_ret - cost
    
    df = pd.DataFrame({"gross_ret": gross_ret,
                       "net_ret": net_ret,
                       "turnover": turnover,
                       "pos": pos})
    
    df["gross_eq"] = (1 + df["gross_ret"]).cumprod()
    df["net_eq"] = (1 + df["net_ret"]).cumprod()
    return df

### EVALUATION METRICS

Here I summarize the strategy performance in key evaluation metrics:
1. Annualized return: mean daily net return compounded over 252 trading days.
2. Annualized volatility: daily standard deviation × √252.
3. Sharpe ratio: annualized risk-adjusted performance indicator, with no risk-free rate subtracted.
4. Average turnover: measure of trading intensity.
5. Total trades: number of days when turnover > 0.
6. Days in market: proxy for exposure and signal frequency.
7. Final net equity: ending value of $1 invested.

I use these metrics to compare strategy variants, MA windows, and parameter choices.
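
The annualization arithmetic can be checked on a handful of made-up daily returns. The sketch below also computes a geometric annualized return for comparison; that extra figure is my own addition for illustration and is not one of the metrics reported by this project.

```python
import numpy as np
import pandas as pd

# Five made-up daily net returns.
net_ret = pd.Series([0.01, -0.005, 0.002, 0.0, 0.004])

mean = net_ret.mean()   # 0.0022
vol = net_ret.std()     # sample standard deviation (ddof=1)

ann_ret = (1 + mean) ** 252 - 1                              # mean daily return compounded
ann_vol = vol * np.sqrt(252)                                 # square-root-of-time scaling
sharpe = mean / vol * np.sqrt(252) if vol > 0 else np.nan    # risk-free rate taken as zero

# For comparison only: annualized return from the realized compounded path.
geo_ann_ret = (1 + net_ret).prod() ** (252 / len(net_ret)) - 1

print(f"ann_ret={ann_ret:.4f} ann_vol={ann_vol:.4f} sharpe={sharpe:.2f}")
print(f"geo_ann_ret={geo_ann_ret:.4f}")
```

Compounding the arithmetic mean never gives a lower number than the geometric figure, so the reported annualized return sits slightly above what the equity curve itself implies when returns are volatile.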

In [7]:
def compute_metrics(res):
    
    mean = res["net_ret"].mean()
   
    vol = res["net_ret"].std()
    
    # Annualized return
    ann_ret = (1 + mean) ** 252 - 1
    # Annualized volatility
    ann_vol = vol * np.sqrt(252)
    # Sharpe ratio (annualized, risk-free rate taken as zero)
    sharpe = (mean / vol * np.sqrt(252)) if vol > 0 else np.nan
    
    return {"Ann_Return": ann_ret,
            "Ann_Vol": ann_vol,
            "Sharpe": sharpe,
            "Avg_Turnover": res["turnover"].mean(),
            "Total_Trades": (res["turnover"] > 0).sum(),
            "Days_In_Market": res["pos"].sum(),
            "Final_Net_Equity": res["net_eq"].iloc[-1]}

### VISUALIZATION

In [8]:
# Equity Curve Plot: To show gross vs. net cumulative returns over time.
def plot_strategy_equity(res, name="strategy_equity.png"):
    plt.figure(figsize=(10,6))
    plt.plot(res["gross_eq"], label="Gross")
    plt.plot(res["net_eq"], label="Net")
    plt.title("Strategy Equity Curve")
    plt.xlabel("Date"); plt.ylabel("Cumulative Return")
    plt.legend(); plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

In [9]:
# Benchmark Comparison Plot: shows strategy performance against SPY/QQQ/BND.
def plot_equity_vs_benchmarks(res, benchmarks_df, name="eq_vs_benchmarks.png"):
    plt.figure(figsize=(10,6))
    plt.plot(res["net_eq"], label="Strategy (Net)")
    # building benchmark cum returns from adj close
    for col in benchmarks_df.columns:
        bret = benchmarks_df[col].pct_change().fillna(0)
        bcum = (1 + bret).cumprod()
        plt.plot(bcum, label=col)
    plt.title("Strategy vs Benchmarks")
    plt.xlabel("Date"); plt.ylabel("Cumulative Return")
    plt.legend(); plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

In [10]:
# Cumulative Exposure: shows how often the strategy is invested, which helps when interpreting Sharpe and turnover.
def plot_cumulative_exposure(res, name="cumulative_exposure.png"):
    cum_exposure = res["pos"].cumsum()
    plt.figure(figsize=(10,4))
    plt.plot(cum_exposure, label="Cumulative Days in Market")
    plt.title("Cumulative Exposure (Days in Market)")
    plt.xlabel("Date"); plt.ylabel("Days"); plt.legend()
    plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

In [11]:
# Daily Returns & Distribution: plots the cumulative sum of daily net returns and a histogram of daily returns with KDE.
# I added this because it is useful for understanding risk, skewness, and tail behavior.
def plot_daily_returns_and_hist(res, name_ts="daily_returns_ts.png", name_hist="daily_returns_hist.png"):
    plt.figure(figsize=(12,4))
    plt.plot(res["net_ret"].cumsum(), label="Cumulative Net Returns (for reference)")
    plt.title("Cumulative Sum of Daily Net Returns")
    plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name_ts)); plt.close()
    
    plt.figure(figsize=(8,4))
    sns.histplot(res["net_ret"].dropna(), bins=80, kde=True)
    plt.title("Histogram of Daily Net Returns")
    plt.xlabel("Daily Net Return"); plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name_hist)); plt.close()

In [12]:
# Commodity Moving Average Charts: plots each commodity with its moving averages to show the signal rationale.
def plot_commodity_ma(comm_df, ma_windows=[20,40], name_prefix="commodity_ma"):
    for col in comm_df.columns:
        plt.figure(figsize=(12,5))
        plt.plot(comm_df[col], label=f"{col} Price")
        for w in ma_windows:
            ma = comm_df[col].rolling(w).mean()
            plt.plot(ma, label=f"{w}-day MA")
        plt.title(f"{col} Price and Moving Averages")
        plt.legend(); plt.tight_layout()
        plt.savefig(os.path.join(CURVES_DIR, f"{name_prefix}_{col}.png")); plt.close()

In [13]:
# Correlation Heatmap: Shows correlations between equities to check for diversification.
def plot_correlation_heatmap(eq_prices, name="correlation_heatmap.png"):
    rets = eq_prices.pct_change().dropna()
    corr = rets.corr()
    plt.figure(figsize=(10,10))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm")
    plt.title("Equity Correlation Matrix")
    plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

In [14]:
# Rolling Volatility: measures how risk changes over time.
def plot_rolling_volatility(eq_prices, window=63, name="rolling_vol.png"):
    rets = eq_prices.pct_change().dropna()
    port = rets.mean(axis=1)  # simple average portfolio for illustration
    roll_vol = port.rolling(window).std() * np.sqrt(252)
    plt.figure(figsize=(10,4))
    plt.plot(roll_vol)
    plt.title(f"Rolling Annualized Volatility (window={window})")
    plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

### MONTE CARLO SIMULATION

I adapted Programming_Assignment_02 with ChatGPT's help to bring the Monte Carlo simulation into this project as a module.
I run random portfolio simulations (5000 per case by default) using annualized mean returns and the annualized covariance matrix.

Similar to the assignment, I evaluate 2 cases:
1. Long-only portfolios
2. Long + short allowed portfolios (weights are scaled so their absolute values sum to 1; net exposure is not forced to zero)

For each simulated weight vector, I compute and store the expected annual return, expected volatility, and Sharpe ratio.

I also add a scatter plot to compare the long-only and long–short sets.
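
The two weight-sampling schemes can be compared on a single draw. In this sketch the expected returns and the diagonal covariance matrix are made-up numbers, and I use NumPy's `default_rng` generator for the illustration instead of the global seed.

```python
import numpy as np

rng = np.random.default_rng(42)
n_assets = 4

# Long-only: non-negative weights that sum to 1.
long_only = rng.random(n_assets)
long_only /= long_only.sum()

# Long + short: signed weights scaled so the absolute values sum to 1.
long_short = rng.normal(0, 1, n_assets)
long_short /= np.abs(long_short).sum()

# Made-up annualized inputs for the illustration.
mu = np.array([0.08, 0.05, 0.03, 0.06])
sigma = np.diag([0.04, 0.02, 0.01, 0.03])

for label, w in (("long-only", long_only), ("long+short", long_short)):
    port_ret = w @ mu
    port_vol = np.sqrt(w @ sigma @ w)
    print(f"{label}: net={w.sum():+.3f} gross={np.abs(w).sum():.3f} "
          f"ret={port_ret:+.4f} vol={port_vol:.4f}")
```

The long + short draw has a gross exposure of 1, but its net exposure is whatever the random draw happens to produce, so these portfolios are not market-neutral by construction.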

In [15]:
def monte_carlo_portfolios(returns_df, n_portfolios=5000, long_only=True, seed=42):
    np.random.seed(seed)
    # returns_df holds prices: drop rows with any missing asset, then convert to daily returns
    rets = returns_df.dropna().pct_change().dropna()
    
    mu = rets.mean() * 252
    sigma = rets.cov() * 252
    
    results = []
    for i in range(n_portfolios):
        if long_only:
            w = np.random.random(len(rets.columns))
            w /= w.sum()
        else:
            w = np.random.normal(0,1,len(rets.columns))
            # scale so absolute weights sum to 1 (unit gross exposure; net exposure is not forced to zero)
            if np.sum(np.abs(w)) > 0:
                w = w / np.sum(np.abs(w))
        port_ret = np.dot(w, mu)
        port_var = np.dot(w.T, np.dot(sigma, w))
        port_vol = np.sqrt(port_var)
        results.append({
            "ret": port_ret,
            "vol": port_vol,
            "sharpe": (port_ret / port_vol if port_vol>0 else np.nan),
            "weights": w
        })
    return pd.DataFrame(results)

In [16]:
def plot_monte_carlo_scatter(mc_df_long, mc_df_short, name="monte_carlo.png"):
    plt.figure(figsize=(10,6))
    plt.scatter(mc_df_long['vol'], mc_df_long['ret'], c=mc_df_long['sharpe'], cmap='viridis', s=8, label='Long-only')
    plt.scatter(mc_df_short['vol'], mc_df_short['ret'], c=mc_df_short['sharpe'], cmap='plasma', s=8, marker='x', label='Long+Short')
    plt.colorbar(label='Sharpe')
    plt.xlabel("Annualized Volatility")
    plt.ylabel("Annualized Return")
    plt.title("Monte Carlo Portfolio Simulation (return vs volatility)")
    plt.legend()
    plt.tight_layout()
    plt.savefig(os.path.join(CURVES_DIR, name)); plt.close()

### RUNNING PIPELINE

In the cell below, I call all the functions created above to run my pipeline and collect the summary metrics in a results df.

In [17]:
def run_full_pipeline(start=START,
                      ma_windows=[20,40],
                      threshold=2,
                      min_consec_days=0,
                      min_hold_days=0,
                      commission=0.0005,
                      slippage=0.0005,
                      n_mc=4000):
    # downloading
    comm = download_prices(COMMODITY_TICKERS, start=start)
    eq = download_prices(EQUITY_TICKERS, start=start)
    benches = download_prices(BENCHMARK_TICKERS, start=start)

    # aligning
    comm_a, eq_a, benches_a = align_data(comm, eq, benches)

    # signal using primary MA (20) and alt MA (40) if needed.
    # Here we produce both and run the pipeline with each MA.
    results_summary = []

    for ma in ma_windows:
        sig_df = aggregate_signal(comm_a, ma_window=ma, threshold=threshold)
        adj_sig = apply_persistence_and_hold(sig_df['raw_signal'],
                                             min_consec_days=min_consec_days,
                                             min_hold_days=min_hold_days)

        # make sure adj_sig index matches eq_a
        adj_sig = adj_sig.reindex(eq_a.index).fillna(0)

        res = compute_strategy_returns(eq_a, adj_sig, commission=commission, slippage=slippage)
        metrics = compute_metrics(res)
        metrics.update({
            "MA": ma,
            "MinConsec": min_consec_days,
            "MinHold": min_hold_days,
            "Commission": commission,
            "Slippage": slippage
        })
        results_summary.append(metrics)

        # Saving daily results
        fname = f"results_ma{ma}_c{min_consec_days}_h{min_hold_days}.csv"
        res.to_csv(os.path.join(OUTPUT_DIR, fname))

        # Plotting
        plot_strategy_equity(res, name=f"strategy_equity_ma{ma}.png")
        if benches_a is not None:
            plot_equity_vs_benchmarks(res, benches_a, name=f"eq_vs_benchmarks_ma{ma}.png")
        plot_cumulative_exposure(res, name=f"cumulative_exposure_ma{ma}.png")
        plot_daily_returns_and_hist(res,
                                    name_ts=f"daily_returns_cumsum_ma{ma}.png",
                                    name_hist=f"daily_returns_hist_ma{ma}.png")
        plot_commodity_ma(comm_a, ma_windows=[ma], name_prefix=f"commodity_ma{ma}")
        plot_correlation_heatmap(eq_a, name=f"corr_heatmap_ma{ma}.png")
        plot_rolling_volatility(eq_a, window=63, name=f"rolling_vol_ma{ma}.png")

    # Running Monte Carlo on equities
    mc_long = monte_carlo_portfolios(eq_a, n_portfolios=n_mc, long_only=True)
    mc_short = monte_carlo_portfolios(eq_a, n_portfolios=n_mc, long_only=False)
    mc_long.to_csv(os.path.join(OUTPUT_DIR, "monte_carlo_long.csv"), index=False)
    mc_short.to_csv(os.path.join(OUTPUT_DIR, "monte_carlo_longshort.csv"), index=False)
    plot_monte_carlo_scatter(mc_long, mc_short, name="monte_carlo_scatter.png")

    # Saving summary
    summary_df = pd.DataFrame(results_summary)
    summary_df.to_csv(os.path.join(OUTPUT_DIR, "experiment_summary_full.csv"), index=False)
    print("Pipeline complete. Outputs saved to", OUTPUT_DIR)
    return summary_df

Below, when the file is run as a standalone Python script, the model runs with these settings:

- Moving average windows of 20 and 40 days
- Threshold of 2 bullish commodities
- Minimum hold of 3 days
- Trading costs of 5 bps commission and 5 bps slippage
- A 3000-sample Monte Carlo run

This structure lets me use the file both as an importable module and as a self-contained executable script.

In [18]:
if __name__ == "__main__":
    # Example run — tweak parameters as desired
    summary = run_full_pipeline(
        start=START,
        ma_windows=[20, 40],
        threshold=2,
        min_consec_days=0,
        min_hold_days=3,   # small minimum hold to reduce whipsaw
        commission=0.0005,
        slippage=0.0005,
        n_mc=3000
    )
    print(summary)

Downloading Corn (ZC=F)...
Downloading Soybeans (ZS=F)...
Downloading Cotton (CT=F)...
Downloading Wheat (ZW=F)...
Downloading DE (DE)...
Downloading CAT (CAT)...
Downloading AGCO (AGCO)...
Downloading CNH (CNH)...
Downloading 20+ Year U.S. Treasuries (TLT)...
Downloading 7–10 Year Treasuries (IEF)...
Downloading Gold (GLD)...
Downloading Broad Commodities (energy-weighted) (DBC)...
Downloading Developed Markets ex-US (EFA)...
Downloading Vanguard Developed ex-US (VEA)...
Downloading Anti-Beta (long low beta, short high beta) (BTAL)...
Downloading Multi-strategy, alt-risk premia (LALT)...
Downloading SPY (SPY)...
Downloading QQQ (QQQ)...
Downloading BND (BND)...
Aligned rows: 6507 (2000-01-03 to 2025-11-18)
Pipeline complete. Outputs saved to output
   Ann_Return   Ann_Vol    Sharpe  Avg_Turnover  Total_Trades  Days_In_Market  \
0    0.054190  0.086166  0.612520      0.060012           781          4227.0   
1    0.046002  0.082932  0.542363      0.044798           583          4157.0 