# Capital Allocation Model - Data Science Coding Challenge

## Background

Our trading strategy operates across a range of instruments, of which cryptocurrency futures are a significant component. For this test, assume that our strategy is applied to four different cryptocurrencies. You will be provided with profit and loss (PnL) data for each of these cryptocurrencies.

## Objective

Develop a capital allocation strategy (e.g., a neural network or a statistical approach) that allocates capital among the cryptocurrencies so as to maximize total returns.


## Instructions

1. **Data Description**:

   - Each dataset `{crypto}.csv` contains hourly PnL data for a cryptocurrency.
   - Columns in the dataset:
     - `ts_hour`: Timestamp of the PnL observation; consecutive rows are one hour apart.
     - `pnl`: PnL of our strategy for the cryptocurrency.

2. **Your Task**:

   - Develop a strategy to allocate capital among the four cryptocurrencies to maximize cumulative returns.

   - Possible allocations for an hour include:
     - Equal allocation (e.g., 25% each).
     - Biased allocation (e.g., 80% to one cryptocurrency, with the remaining 20% distributed among the others).
     - Zero allocation to any or all cryptocurrencies.

3. **Assumptions**:

   Given the limited time available for the test, we simplify the problem with the following assumptions:

   - You can rebalance your allocations as frequently as you wish with no additional trading cost.
   - Allocation changes are immediate with no delay.

4. **Deliverables**:

   - A Jupyter notebook containing:
     - Your capital allocation strategy.
     - A brief explanation of your approach.
   - A `requirements.txt` file listing all the necessary packages to run your notebook.
   - Other necessary files such as model weights, etc.

5. **Evaluation Criteria**:

   We will evaluate the effectiveness of your model using a separate test dataset. Your strategy will be evaluated based on the following criteria:

   - **Accuracy**: How well does your strategy maximize returns?
   - **Clarity**: Is your code well-documented and your approach clearly explained?

6. **Testing Your Results**:
   - You should provide a function `generate_allocation_prediction` that takes a list of historical PnL data of the four cryptocurrencies as input and outputs the allocation for the next hour.
   - We will use the `evaluate_strategy` function that is provided to you in the notebook to test the effectiveness of your strategy on a separate test dataset.
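To make the allocation options in step 2 concrete, the sketch below represents each option as a list of weights and computes the resulting one-hour return as the weighted sum of next-hour PnL values, which is how the provided `evaluate_strategy` function scores an allocation. The PnL figures are made up for illustration and are not taken from the datasets; the helper name `period_return` is hypothetical.

```python
# Hypothetical next-hour PnL values for four cryptocurrencies (illustrative only).
next_hour_pnl = [0.010, -0.004, 0.002, 0.000]

# One weight per cryptocurrency; each list sums to at most 1.
allocations = {
    "equal": [0.25, 0.25, 0.25, 0.25],
    "biased": [0.80, 0.10, 0.05, 0.05],
    "zero": [0.0, 0.0, 0.0, 0.0],
}


def period_return(allocation, pnl_values):
    """Weighted sum of next-hour PnL, one weight per cryptocurrency."""
    return sum(w * p for w, p in zip(allocation, pnl_values))


period_returns = {name: period_return(w, next_hour_pnl) for name, w in allocations.items()}
for name, r in period_returns.items():
    print(f"{name}: {r:.4f}")
```

Over the evaluation period these hourly returns are compounded, so each hour multiplies the running total by one plus that hour's return.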

## Implementation Constraints

This coding challenge should be completed within 3 working days (24 working hours) from the time you receive it. If you are working on it part-time, it should be completed within 7 working days.

Please note that the time constraint is not a strict deadline. You do not need to stress about submitting at the last minute. However, we expect you to complete the challenge within this approximate timeframe.

Once you have completed the challenge, please send your submission to `codingchallenges@eonlabs.com`.

### Email Submission Guidelines

- **Subject Line**: `Data Scientist Coding Challenge Submission - [Your Full Name]`
- **Email Body**:
  - Attach a zip file containing all your deliverables (Jupyter notebook, `requirements.txt`, and any other necessary files). Please name the zip file `[Your Full Name]_submission.zip`.

If you have any questions, please feel free to reach out to `codingchallenges+questions@eonlabs.com`.


## Data Loading Example
Here is an example of how to read the data:

In [None]:
import pandas as pd

cryptos = ["BTC", "ETH", "ADA", "DOGE"]
# Read the CSV file into a pandas DataFrame
pnl_df_list = [pd.read_csv(f'data/{crypto}.csv') for crypto in cryptos]

# Print the name and head of each DataFrame to verify the data
for crypto, df in zip(cryptos, pnl_df_list):
    print(crypto)
    print(df.head())
    print("\n")

## Example Allocation Function
Here is an example template for the allocation function:

In [3]:
def generate_allocation_prediction(pnl_df_list):
    """Generate the allocation for the next hour from the current and historical PnL data."""
    # In this example, we allocate equally to each of the cryptocurrencies
    allocation = [1/len(pnl_df_list)] * len(pnl_df_list)
    return allocation
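As a slightly less trivial illustration, the sketch below weights each cryptocurrency by its average PnL over a trailing window, ignores instruments whose recent average is not positive, and allocates nothing when none qualifies. This is only an assumed baseline for showing the expected input and output shapes, not a recommended solution; the name `trailing_mean_allocation` and the 24-hour `window` are hypothetical. It accepts either a DataFrame with a `pnl` column or a Series of PnL values.

```python
import pandas as pd


def trailing_mean_allocation(pnl_list, window=24):
    """Allocate in proportion to positive trailing-mean PnL (illustrative baseline)."""
    scores = []
    for item in pnl_list:
        # Accept a DataFrame with a `pnl` column or a plain Series of PnL values.
        pnl = item["pnl"] if isinstance(item, pd.DataFrame) else item
        mean_pnl = pnl.tail(window).mean() if len(pnl) > 0 else 0.0
        # Treat missing or non-positive averages as "no allocation".
        scores.append(max(float(mean_pnl), 0.0) if pd.notna(mean_pnl) else 0.0)

    total = sum(scores)
    if total == 0:
        return [0.0] * len(pnl_list)  # nothing looks attractive: hold no position
    return [s / total for s in scores]


# Usage on made-up data: two Series and two DataFrames.
example = [
    pd.Series([0.01, 0.03]),
    pd.Series([-0.02, -0.01]),
    pd.DataFrame({"pnl": [0.02, 0.02]}),
    pd.DataFrame({"pnl": [0.0, 0.0]}),
]
weights = trailing_mean_allocation(example)
print(weights)
```

Because `evaluate_strategy` rejects any allocation whose sum exceeds 1, an implementation that normalizes weights like this may want to guard against floating-point rounding pushing the sum marginally above 1.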


## Evaluation Criteria
Your `generate_allocation_prediction` function will be evaluated using the following method:

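1. The function's input is a list with one entry per cryptocurrency, each containing that cryptocurrency's historical PnL data up to and including the current hour. Note that `evaluate_strategy` passes each entry as a pandas Series of `pnl` values indexed by `ts_hour`.
2. The function should return a list of allocations, one per cryptocurrency, in the same order as the input.
3. Allocations are expressed as fractions (e.g., `0.25` for 25%), and their sum must not exceed 1 (100%).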
4. The function will be tested using the `evaluate_strategy` function with our interval data as shown below.
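The output checks can be run locally before submitting. The helper below is a hypothetical convenience, not part of the provided notebook: it mirrors the length and sum checks in `evaluate_strategy`, and additionally rejects non-finite and (by default) negative weights. The negative-weight rule is an added assumption, since the challenge text does not say whether negative allocations are allowed.

```python
import math


def check_allocation(allocation, n_assets=4, allow_negative=False):
    """Raise ValueError if `allocation` would fail the evaluation checks."""
    if len(allocation) != n_assets:
        raise ValueError("Allocation length does not match the number of cryptocurrencies")
    if any(not math.isfinite(w) for w in allocation):
        raise ValueError("Allocation contains NaN or infinite values")
    # Assumption: negative weights are disallowed unless explicitly permitted.
    if not allow_negative and any(w < 0 for w in allocation):
        raise ValueError("Allocation contains negative weights")
    if sum(allocation) > 1:
        raise ValueError("Sum of allocation is greater than 1")
    return True


print(check_allocation([0.25, 0.25, 0.25, 0.25]))  # passes all checks
try:
    check_allocation([0.6, 0.6, 0.0, 0.0])  # sums to 1.2
except ValueError as err:
    print(err)
```

The NaN check is worth keeping even though `evaluate_strategy` does not perform it: a NaN weight makes `sum(allocation) > 1` evaluate to `False`, so it would pass the built-in check and then turn the cumulative return into NaN.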


## Strategy Evaluation



In [None]:
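def evaluate_strategy(pnl_df_list):
    """Compound the hourly allocation-weighted PnL over the evaluation period.

    Relies on the module-level `cryptos` list and `generate_allocation_prediction`.
    """
    # Initialize cumulative returns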
    cumulative_return = 1

    # Generate the date range for the evaluation period
    date_range = pd.date_range(start='2023-06-01 00:00:00', end='2024-06-01 00:00:00', freq='H')
    date_strings = date_range.strftime('%Y-%m-%dT%H:%M:%S').tolist()
    
    # Preprocess data to avoid filtering in the loop
    pnl_dict = {crypto: df.set_index('ts_hour')['pnl'] for crypto, df in zip(cryptos, pnl_df_list)}

    for i in range(len(date_strings) - 1):
        date_string = date_strings[i]
        next_date_string = date_strings[i + 1]
        
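        # Prepare data up to and including the current hour.
        # Note: each element is a pandas Series of `pnl` indexed by `ts_hour`, not a DataFrame.
        pnl_df_list_for_date = [pnl_dict[crypto].loc[:date_string] for crypto in cryptos]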
        
        # Get the next hour PnL values
        next_hour_pnl_list = [pnl_dict[crypto].get(next_date_string, 0) for crypto in cryptos]
        
        # Get the allocation predictions
        allocation = generate_allocation_prediction(pnl_df_list_for_date)
        
        if len(allocation) != len(pnl_df_list):
            raise ValueError("Allocation length does not match the number of cryptocurrencies")
        if sum(allocation) > 1:
            raise ValueError("Sum of allocation is greater than 1")

        # Calculate period return
        period_return = sum(weight * pnl for weight, pnl in zip(allocation, next_hour_pnl_list))
        cumulative_return = cumulative_return * (1 + period_return)
    
    return cumulative_return
    

cryptos = ["BTC", "ETH", "ADA", "DOGE"]
pnl_df_list = [pd.read_csv(f'test_data/{crypto}.csv') for crypto in cryptos]
cumulative_return = evaluate_strategy(pnl_df_list)
print(cumulative_return)

## Start your work here
You can start by exploring the data and then proceed to develop your capital allocation strategy. We wish you the best of luck and look forward to seeing your innovative solutions!