# Retrieve most recent EOD prices from QuoteMedia and store in h5 format

## Try to match YuWei's approach exactly

Yes, the code you provided **does create an HDF5 file that maps `symbol` to `sid`**. Here's how it works and how the mapping is stored:

### Key Parts of the Code That Create the Symbol-to-SID Mapping

1. **Loading Symbols Data and Creating SID Mapping** (`load_symbols()` function):
   - The function `load_symbols(tickers)` reads ticker data from the CSV file `QUOTEMEDIA_TICKERS_6d75499fefd916e54334b292986eafcc.csv`.
   - It then turns the positional index of the filtered rows into a new column called `'sid'`, which serves as a unique integer identifier for each symbol (this is the symbol-to-`sid` mapping).

   ```python
   def load_symbols(tickers):
       df = pd.read_csv(custom_data_path / ticker_data_name)
       return (df[df.ticker.isin(tickers)]
               .reset_index(drop=True)
               .reset_index()
               .rename(columns={'index': 'sid'}))
   ```

   - The **symbols DataFrame** that is returned contains the following columns: 
     - **`sid`**: A unique identifier for each symbol (starting from 0 and incrementing).
     - **`ticker`**: The ticker symbol (e.g., AAPL, MSFT, etc.).
   - This DataFrame is then saved to the HDF5 file under the key `'equities'`:

   ```python
   symbols.to_hdf(custom_data_path / 'quandl.h5', key='equities', format='t')
   ```

2. **How the SID Mapping Is Used**:
   - The mapping between `symbol` and `sid` is stored in the HDF5 file under the key `'equities'`.
   - Later, the code uses this mapping to create a **SID column** in the prices data. This ensures that the price data stored in the HDF5 file is associated with the correct `sid`.
   
   ```python
   sid_map = symbols[['ticker', 'sid']].set_index('ticker')['sid'].to_dict()
   temp = prices.reset_index()
   prices['sid'] = temp['ticker'].map(sid_map).values
   ```

3. **Saving Prices Data to HDF5 File**:
   - The function `save_prices_to_hdf()` is used to save the price data for each symbol in the HDF5 file under the path `'prices/{sid}'`.
   - The **`sid`** is used as the identifier for each symbol's price data.

   ```python
   def save_prices_to_hdf(sid, symbol, prices, custom_data_path):
       p = prices.loc[symbol, ['adj_open', 'adj_high', 'adj_low', 'adj_close', 'adj_volume']]
       p.index = p.index.values

       # Optimize data types
       p = p.astype({
           'adj_open': 'float32',
           'adj_high': 'float32',
           'adj_low': 'float32',
           'adj_close': 'float32',
           'adj_volume': 'int32'
       })

       # Save to HDF5
       p.to_hdf(custom_data_path / 'quandl.h5', key='prices/{}'.format(sid), format='t')
   ```

4. **Summary of HDF5 File Structure**:
   - **`quandl.h5`** is the name of the HDF5 file that contains the data.
   - The structure of this HDF5 file is as follows:
     - **`/equities`**: This table contains metadata for each symbol, specifically mapping each symbol to a unique `sid`. The columns are:
       - **`sid`**: Unique identifier for each symbol.
       - **`ticker`**: The ticker symbol (e.g., "AAPL", "GOOG").
     - **`/prices/{sid}`**: For each `sid`, there is a separate group containing the price data. This data includes:
       - **`adj_open`**: Adjusted open price.
       - **`adj_high`**: Adjusted high price.
       - **`adj_low`**: Adjusted low price.
       - **`adj_close`**: Adjusted close price.
       - **`adj_volume`**: Adjusted volume.

5. **Visual Representation of the HDF5 File**:
   - **`quandl.h5`**
     - **`/equities`** (Table that contains symbol metadata):
       | **sid** | **ticker** |
       |---------|------------|
       | 0       | AAPL       |
       | 1       | MSFT       |
       | ...     | ...        |
     
     - **`/prices/0`** (Price data for symbol with `sid=0`, e.g., AAPL):
       | **Date**       | **adj_open** | **adj_high** | **adj_low** | **adj_close** | **adj_volume** |
       |----------------|--------------|--------------|-------------|--------------|----------------|
       | 2024-01-01     | 100.25       | 102.00       | 99.50       | 101.75       | 1,500,000      |
       | 2024-01-02     | 101.50       | 103.20       | 100.75      | 102.85       | 1,200,000      |
       | ...            | ...          | ...          | ...         | ...          | ...            |
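
The `sid` assignment and the mapping onto the price rows described in steps 1 and 2 can be tried on a small frame. This is only a sketch: the ticker rows, prices, and the helper name `assign_sids` are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Toy ticker table standing in for the QuoteMedia tickers CSV (made-up rows)
tickers_csv = pd.DataFrame({
    "ticker": ["AAPL", "IBM", "MSFT"],
    "name": ["Apple", "IBM", "Microsoft"],
})

def assign_sids(df, tickers):
    """Keep only the requested tickers and number them 0..n-1 as `sid`."""
    return (df[df.ticker.isin(tickers)]
            .reset_index(drop=True)   # discard the original CSV row numbers
            .reset_index()            # turn the new 0..n-1 index into a column
            .rename(columns={"index": "sid"}))

symbols = assign_sids(tickers_csv, ["AAPL", "MSFT"])
sid_map = symbols.set_index("ticker")["sid"].to_dict()

# Toy price frame indexed by (ticker, date), as in the original code
prices = pd.DataFrame(
    {"adj_close": [101.75, 102.85, 310.10]},
    index=pd.MultiIndex.from_tuples(
        [("AAPL", "2024-01-01"), ("AAPL", "2024-01-02"), ("MSFT", "2024-01-01")],
        names=["ticker", "date"],
    ),
)
# Look up each row's ticker in the mapping to get its sid
prices["sid"] = prices.index.get_level_values("ticker").map(sid_map)

print(symbols)
print(prices)
```

Because the sids come from row positions after filtering, they depend on which tickers are requested and on the CSV's row order. They are only meaningful together with the `equities` table saved in the same file.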

### Summary:
- **HDF5 File Content**:
  - The **symbol to sid mapping** is stored in an HDF5 table under the key **`/equities`**.
  - Each `sid` has corresponding **price data** stored under a separate key (`/prices/{sid}`), making it easy to look up all the price data for a particular symbol.
  
- **Why Is This Useful?**:
  - Each security's prices live under their own key, so you can read or replace one symbol's history (`store['prices/{sid}']`) without loading the rest of the file.
  - By storing prices under `sid`, each security has exactly one price table, and the ticker itself only needs to be stored once, in `/equities`.

So yes, your code **does maintain an HDF5 file to map symbols to sids** in the `equities` group, and uses `sid` to organize and store corresponding price data under separate keys for each `sid`.
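
Reading the data back follows the same two steps in reverse: look up the `sid` in `equities`, then read `prices/{sid}`. The sketch below assumes the layout described above. The helper names `price_key_for` and `load_prices` are hypothetical, and `load_prices` needs a real file plus PyTables, so only the key lookup is exercised here.

```python
import pandas as pd

def price_key_for(symbol, equities, symbol_col="ticker"):
    """Return the HDF5 key holding prices for `symbol`, e.g. 'prices/0'."""
    match = equities.loc[equities[symbol_col] == symbol, "sid"]
    if match.empty:
        raise KeyError(f"{symbol!r} not found in equities table")
    return f"prices/{match.iloc[0]}"

def load_prices(h5_path, symbol, symbol_col="ticker"):
    """Read the equities table, resolve the sid, then read that price table."""
    with pd.HDFStore(h5_path, mode="r") as store:
        equities = store["equities"]
        return store[price_key_for(symbol, equities, symbol_col)]

# Resolving the key needs only the equities table, so it can be tried without a file
equities = pd.DataFrame({"sid": [0, 1], "ticker": ["AAPL", "MSFT"]})
print(price_key_for("MSFT", equities))
```

For a file whose `equities` table names the column `symbol` instead of `ticker`, pass `symbol_col="symbol"`.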

In [None]:
import pandas as pd

with pd.HDFStore(h5_path, mode='r') as store:
    print(store.keys())


# Production Version

In [None]:
import pandas as pd
import numpy as np
import requests
import io
import zipfile
import time
from datetime import datetime
from pathlib import Path
from zipline.utils.calendar_utils import get_calendar
from tqdm import tqdm
import warnings
from tables import NaturalNameWarning
warnings.filterwarnings('ignore', category=NaturalNameWarning)


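# Set up API key and base URL
# Read the key from the environment rather than hard-coding it in the notebook
# (NASDAQ_DATA_LINK_API_KEY is an assumed variable name; use whatever you export)
import os
API_KEY = os.environ["NASDAQ_DATA_LINK_API_KEY"]
BASE_URL = "https://data.nasdaq.com/api/v3/datatables/QUOTEMEDIA"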

# Define file paths
zipline_root = '~/repos/edge-seeker/zipline-x/.zipline'
custom_data_path = Path(zipline_root, 'custom_data')
h5_path = custom_data_path / 'quotemedia_eod_data_v3.h5'

# Function to download data from API
def get_data(endpoint, params):
    url = f"{BASE_URL}/{endpoint}"
    params["api_key"] = API_KEY
    response = requests.get(url, params=params)
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API request failed with status code {response.status_code}")

# Function to download and process ZIP file
def download_and_process_zip(url):
    response = requests.get(url)
    if response.status_code == 200:
        print(f'response: {response}')
        z = zipfile.ZipFile(io.BytesIO(response.content))
        csv_filename = z.namelist()[0]  # Assume the first file in the ZIP is the CSV we want
        with z.open(csv_filename) as f:
            df = pd.read_csv(f)
        return df
    else:
        raise Exception(f"Failed to download ZIP file. Status code: {response.status_code}")

# Download tickers
def download_tickers():
    tickers_response = get_data("TICKERS", {"qopts.export": "true"})
    print(f'tickers response: {tickers_response}')
    tickers_download_link = tickers_response['datatable_bulk_download']['file']['link']
    tickers_df = download_and_process_zip(tickers_download_link)
    return tickers_df

# Download adjusted EOD price data
def download_prices(start_date, end_date):
    params = {
        "date.gte": start_date,
        "date.lte": end_date,
        "qopts.columns": "ticker,date,adj_open,adj_high,adj_low,adj_close,adj_volume",
        "qopts.export": "true"
    }
    price_response = get_data("PRICES", params)
    price_download_link = price_response['datatable_bulk_download']['file']['link']
    prices_df = download_and_process_zip(price_download_link)
    return prices_df

# Reindex prices by trading days
def reindex_trading_days(prices_df, trading_days):
    def reindex_group(group):
        # Sort the group by 'date' to ensure the index is monotonic
        group = group.sort_index()  # This ensures the index is properly sorted
        group = group.droplevel('symbol')
        return group.reindex(trading_days, method='ffill')

    # Apply reindexing to each group
    new_df = prices_df.groupby(level='symbol').apply(reindex_group)

    # Return the updated DataFrame
    return new_df

# Main function to create HDF5
def create_hdf5():
    # Set start and end dates
    start_date = "2008-01-01"
    #start_date = "2024-08-01"
    end_date = datetime.now().strftime("%Y-%m-%d")

    # Download data
    print("Downloading tickers...")
    tickers_df = download_tickers()

    print("Downloading adjusted EOD price data...")
    time.sleep(10)  # Preventing rate-limiting
    prices_df = download_prices(start_date, end_date)

    # Process tickers data
    tickers_df.rename(columns={'ticker': 'symbol'}, inplace=True)
    tickers_df.reset_index(inplace=True, drop=True)
    tickers_df.reset_index(inplace=True)
    tickers_df.rename(columns={'index': 'sid'}, inplace=True)

    # Modify sid to be a valid Python identifier
    #tickers_df['sid'] = 's' + tickers_df['sid'].astype(str)

    # Process prices data
    prices_df["date"] = pd.to_datetime(prices_df["date"])
    prices_df.rename(columns={'ticker':'symbol','adj_open': 'open', 'adj_high': 'high', 'adj_low': 'low', 'adj_close': 'close', 'adj_volume': 'volume'}, inplace=True)
    prices_df.set_index(['symbol', 'date'], inplace=True)

    # Filter by trading days
    nyse_calendar = get_calendar('NYSE')
    trading_days = nyse_calendar.sessions_in_range(start=start_date, end=end_date)
    prices_df = prices_df[prices_df.index.get_level_values('date').isin(trading_days)]

    # Reindex to fill missing trading days
    prices_df = reindex_trading_days(prices_df, trading_days)
    prices_df.index.set_names(['symbol', 'date'], inplace=True)
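    # Fill remaining gaps within each symbol so values never leak from one symbol into the next
    # (fillna(method=...) is deprecated; ffill()/bfill() replace it)
    prices_df = prices_df.groupby(level='symbol').ffill().groupby(level='symbol').bfill()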

    # Ensure no missing symbols between tickers and prices
    tickers_in_prices = prices_df.index.get_level_values('symbol').unique()
    tickers_df = tickers_df[tickers_df['symbol'].isin(tickers_in_prices)]
    prices_df = prices_df[prices_df.index.get_level_values('symbol').isin(tickers_df['symbol'])]

    # Create sid mapping
    sid_map = tickers_df[['symbol', 'sid']].set_index('symbol')['sid'].to_dict()
    prices_df = prices_df.reset_index()
    prices_df['sid'] = prices_df['symbol'].map(sid_map)
    prices_df.set_index(['sid', 'date'], inplace=True)

    # Store data in HDF5 (mode='w' overwrites any existing file at h5_path)
    with pd.HDFStore(h5_path, mode='w') as store:
        # Store tickers data
        store.put('equities', tickers_df, format='table')

        # Store prices data for each sid
        for sid, group in tqdm(prices_df.groupby(level='sid'), desc="Saving prices to HDF5"):
            sid_data = group.droplevel('sid')[['open', 'high', 'low', 'close', 'volume']]
            sid_data = sid_data.astype({
                'open': 'float32',
                'high': 'float32',
                'low': 'float32',
                'close': 'float32',
                'volume': 'int32'
            })
            store.put(f'prices/{sid}', sid_data, format='table')


    print("Data successfully stored in HDF5 format.")

if __name__ == '__main__':
    create_hdf5()
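
When all symbols are stacked in one frame, it matters whether gaps are filled across the whole frame or within each symbol. The sketch below compares the two on made-up data. The symbols `AAA` and `BBB` and their prices are purely illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"])
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["symbol", "date"])
# BBB has no data on its first date (e.g. it listed a day later)
toy = pd.DataFrame({"close": [10.0, np.nan, 12.0, np.nan, 50.0, 51.0]}, index=index)

# Frame-wide fill: BBB's first row inherits AAA's last close
frame_wide = toy.ffill().bfill()

# Per-symbol fill: BBB's first row is back-filled from BBB's own next close
per_symbol = toy.groupby(level="symbol").ffill().groupby(level="symbol").bfill()

print(frame_wide.loc[("BBB", dates[0]), "close"])
print(per_symbol.loc[("BBB", dates[0]), "close"])
```

The frame-wide variant carries the previous symbol's last value into the next symbol's leading gap, while the grouped variant keeps every fill inside one symbol.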


## Really helpful code to confirm the columns in the h5 file

In [None]:
import pandas as pd
from pathlib import Path

# Define file paths
zipline_root = '~/repos/edge-seeker/zipline-x/.zipline'
custom_data_path = Path(zipline_root, 'custom_data')
h5_path = custom_data_path / 'quotemedia_eod_data_v3.h5'

# Open the HDF5 file to check its content
with pd.HDFStore(h5_path, mode='r') as store:
    # Display all keys in the HDF5 file
    print("\nList of keys in the HDF5 file:")
    print(store.keys())

    # Check for 'equities' table
    if '/equities' in store:
        print("\n>>> 'equities' table found. Checking its structure...")
        equities_df = store['equities']
        equities_df.info()  # info() prints directly and returns None
        print(equities_df.head())

        # Expected columns: ['sid', 'symbol', ...]
        if 'sid' in equities_df.columns and 'symbol' in equities_df.columns:
            print("\n'equities' table has the correct columns.")
        else:
            print("\nWarning: 'equities' table is missing expected columns.")

    else:
        print("\nError: 'equities' table not found in the HDF5 file.")

    # Check for 'prices/{sid}' datasets
    price_keys = [key for key in store.keys() if key.startswith('/prices/')]
    if price_keys:
        print(f"\n>>> Found {len(price_keys)} price tables in the HDF5 file. Sample keys:")
        print(price_keys[:5])  # Print the first 5 keys for inspection

        # Check the structure of one of the price datasets
        sample_key = price_keys[0]
        print(f"\n>>> Checking structure of the sample price dataset: {sample_key}")
        sample_prices = store[sample_key]
        sample_prices.info()  # info() prints directly and returns None
        print(sample_prices.head())

        # Expected columns: ['open', 'high', 'low', 'close', 'volume']
        expected_columns = ['open', 'high', 'low', 'close', 'volume']
        if all(col in sample_prices.columns for col in expected_columns):
            print("\nSample price dataset has the correct columns.")
        else:
            print("\nWarning: Sample price dataset is missing expected columns.")
    else:
        print("\nError: No price tables found in the HDF5 file under '/prices/'.")

print("\nData file verification completed.")


## Really helpful code to confirm the date index of the h5 file

In [1]:
import pandas as pd
from pathlib import Path

# Define file paths
zipline_root = '~/repos/edge-seeker/zipline-x/.zipline'
custom_data_path = Path(zipline_root, 'custom_data')
h5_path = custom_data_path / 'quotemedia_eod_data_v3.h5'

# Open the HDF5 file to check its content
with pd.HDFStore(h5_path, mode='r') as store:
    # List all keys in the HDF5 file
    print("\nList of keys in the HDF5 file:")
    keys = store.keys()
    print(keys)

    # Filter keys to find all price datasets
    price_keys = [key for key in keys if key.startswith('/prices/')]
    
    if not price_keys:
        print("\nError: No price tables found in the HDF5 file under '/prices/'.")
    else:
        # Check the structure of each price dataset
        print(f"\n>>> Checking date index for {len(price_keys)} price datasets...")

        issues_found = False
        for key in price_keys[:5]:  # Limiting the check to 5 samples for efficiency
            print(f"\n>>> Checking price dataset: {key}")
            prices_df = store[key]
            
            # Check if 'date' is the index
            if prices_df.index.name != 'date':
                print(f"Warning: Dataset {key} does not have 'date' as the index.")
                issues_found = True
            else:
                # Check if index is a DatetimeIndex
                if not isinstance(prices_df.index, pd.DatetimeIndex):
                    print(f"Warning: Dataset {key} does not have a DatetimeIndex for the 'date' index.")
                    issues_found = True
                else:
                    print(f"Dataset {key} has 'date' as a proper DatetimeIndex.")

            # Display a summary of the dataset for inspection
            prices_df.info()  # info() prints directly and returns None
            print(prices_df.head())

        if not issues_found:
            print("\nAll checked price datasets have the correct 'date' index format.")
        else:
            print("\nIssues found in the 'date' index of some price datasets. Please review the warnings above.")

print("\nDate index verification completed.")



List of keys in the HDF5 file:
['/equities', '/prices/10005', '/prices/10006', '/prices/10007', '/prices/10009', '/prices/10010', '/prices/10011', '/prices/10013', '/prices/10014', '/prices/10015', '/prices/10016', '/prices/10018', '/prices/10019', '/prices/10020', '/prices/10021', '/prices/10022', '/prices/10023', '/prices/10024', '/prices/10025', '/prices/10026', '/prices/10027', '/prices/10028', '/prices/10029', '/prices/10030', '/prices/10031', '/prices/10032', '/prices/10033', '/prices/10035', '/prices/10036', '/prices/10037', '/prices/10038', '/prices/10040', '/prices/10041', '/prices/10043', '/prices/10044', '/prices/10047', '/prices/10052', '/prices/10053', '/prices/10062', '/prices/10063', '/prices/10064', '/prices/10065', '/prices/10066', '/prices/10067', '/prices/10068', '/prices/10069', '/prices/10070', '/prices/10071', '/prices/10072', '/prices/10073', '/prices/10074', '/prices/10075', '/prices/10077', '/prices/10078', '/prices/10079', '/prices/10080', '/prices/10081', '/
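
The check above confirms the index type. To see which dates a price table actually covers, a small helper like the one below can be called on each `store[key]`. It is a sketch only: `date_range_summary` is a hypothetical name and the sample table is made up.

```python
import pandas as pd

def date_range_summary(prices_df):
    """Return (first_date, last_date, n_rows) for a price table indexed by date."""
    idx = pd.DatetimeIndex(prices_df.index)
    return idx.min(), idx.max(), len(idx)

# Made-up table in the same shape as one `/prices/{sid}` dataset
sample = pd.DataFrame(
    {"close": [101.75, 102.85, 103.10]},
    index=pd.DatetimeIndex(["2024-01-02", "2024-01-03", "2024-01-04"], name="date"),
)
first, last, n = date_range_summary(sample)
print(f"{first.date()} to {last.date()} ({n} rows)")
```

Inside the `with pd.HDFStore(...)` block this would be used as `date_range_summary(store[key])` for each key in `price_keys`.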

# Check to make sure h5 has data for all the symbols it should

In [4]:
import pandas as pd
from zipline.utils.calendar_utils import get_calendar
from pathlib import Path

# Define file paths
zipline_root = '~/repos/edge-seeker/zipline-x/.zipline'
custom_data_path = Path(zipline_root, 'custom_data')
h5_path = custom_data_path / 'quotemedia_eod_data_v3.h5'

# Load NYSE trading calendar
nyse_calendar = get_calendar('NYSE')
start_date = "2024-08-01"  # Adjust as needed
end_date = pd.Timestamp.now().strftime("%Y-%m-%d")  # Adjust as needed
trading_days = nyse_calendar.sessions_in_range(start=start_date, end=end_date)

# Open the HDF5 file to check price data
with pd.HDFStore(h5_path, mode='r') as store:
    # Get the list of all available 'prices/{sid}' keys
    price_keys = [key for key in store.keys() if key.startswith('/prices/')]

    if not price_keys:
        print("\nError: No price tables found in the HDF5 file under '/prices/'.")
    else:
        # Check each price dataset to ensure completeness
        print(f"\n>>> Validating price data completeness for {len(price_keys)} price datasets...")

        all_data_complete = True

        for key in price_keys:
            sid = key.split('/')[-1]  # Extract sid from the key
            prices_df = store[key]

            # Check if all trading days are present in the dataset
            missing_dates = set(trading_days) - set(prices_df.index)

            if missing_dates:
                all_data_complete = False
                print(f"Warning: Dataset {key} for sid {sid} is missing the following dates: {sorted(missing_dates)}")

                # Show a summary of the affected dataset
                print(f"Summary for {key}:")
                prices_df.info()  # info() prints directly and returns None
                print(prices_df.head())

        if all_data_complete:
            print("\nAll price datasets have complete data for all expected NYSE trading days.")
        else:
            print("\nIssues were found in the completeness of some price datasets. Please review the warnings above.")

print("\nPrice data validation completed.")



>>> Validating price data completeness for 10677 price datasets...

All price datasets have complete data for all expected NYSE trading days.

Price data validation completed.
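
The validation above checks dates within each price table. It does not compare the set of price tables against the `equities` table. A minimal sketch of that cross-check follows. It works on plain lists, `compare_sids` is a hypothetical helper, and the keys and sids in the example are made up.

```python
def compare_sids(store_keys, equity_sids):
    """Compare sids found under '/prices/' with those in the equities table.

    Returns (missing_prices, orphan_prices) as sorted lists of strings.
    """
    price_sids = {k.split("/")[-1] for k in store_keys if k.startswith("/prices/")}
    expected = {str(s) for s in equity_sids}
    return sorted(expected - price_sids), sorted(price_sids - expected)

keys = ["/equities", "/prices/10005", "/prices/10006", "/prices/10009"]
missing, orphans = compare_sids(keys, [10005, 10006, 10007])
print("sids without a price table:", missing)
print("price tables without an equities row:", orphans)
```

Against the real file, the call would be `compare_sids(store.keys(), store['equities']['sid'])` inside the `with pd.HDFStore(...)` block.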


## Combined download and preprocessing code.

<b>The prices table is not in the correct format for the quandl_custom_bundle code, though.</b>

In [3]:
# Same code, but storing prices in a different hierarchy
import pandas as pd
import numpy as np
import requests
import io
import zipfile
import time
from datetime import datetime
from pathlib import Path
from zipline.utils.calendar_utils import get_calendar
from tqdm import tqdm
import warnings
from tables import NaturalNameWarning
warnings.filterwarnings('ignore', category=NaturalNameWarning)


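# Set up API key and base URL
# Read the key from the environment rather than hard-coding it in the notebook
# (NASDAQ_DATA_LINK_API_KEY is an assumed variable name; use whatever you export)
import os
API_KEY = os.environ["NASDAQ_DATA_LINK_API_KEY"]
BASE_URL = "https://data.nasdaq.com/api/v3/datatables/QUOTEMEDIA"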

# Define file paths
zipline_root = '~/repos/edge-seeker/zipline-x/.zipline'
custom_data_path = Path(zipline_root, 'custom_data')
h5_path = custom_data_path / 'quotemedia_eod_data_v1.h5'

# Function to download data from API
def get_data(endpoint, params):
    url = f"{BASE_URL}/{endpoint}"
    params["api_key"] = API_KEY
    response = requests.get(url, params=params)
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API request failed with status code {response.status_code}")

# Function to download and process ZIP file
def download_and_process_zip(url):
    response = requests.get(url)
    if response.status_code == 200:
        print(f'response: {response}')
        z = zipfile.ZipFile(io.BytesIO(response.content))
        csv_filename = z.namelist()[0]  # Assume the first file in the ZIP is the CSV we want
        with z.open(csv_filename) as f:
            df = pd.read_csv(f)
        return df
    else:
        raise Exception(f"Failed to download ZIP file. Status code: {response.status_code}")

# Download tickers
def download_tickers():
    tickers_response = get_data("TICKERS", {"qopts.export": "true"})
    print(f'tickers response: {tickers_response}')
    tickers_download_link = tickers_response['datatable_bulk_download']['file']['link']
    tickers_df = download_and_process_zip(tickers_download_link)
    return tickers_df

# Download adjusted EOD price data
def download_prices(start_date, end_date):
    params = {
        "date.gte": start_date,
        "date.lte": end_date,
        "qopts.columns": "ticker,date,adj_open,adj_high,adj_low,adj_close,adj_volume",
        "qopts.export": "true"
    }
    price_response = get_data("PRICES", params)
    price_download_link = price_response['datatable_bulk_download']['file']['link']
    prices_df = download_and_process_zip(price_download_link)
    return prices_df

# Reindex prices by trading days
def reindex_trading_days(prices_df, trading_days):
    def reindex_group(group):
        # Sort the group by 'date' to ensure the index is monotonic
        group = group.sort_index()  # This ensures the index is properly sorted
        group = group.droplevel('ticker')
        return group.reindex(trading_days, method='ffill')

    # Apply reindexing to each group
    new_df = prices_df.groupby(level='ticker').apply(reindex_group)

    # Return the updated DataFrame
    return new_df

# Main function to create HDF5
def create_hdf5():
    # Set start and end dates
    #start_date = "2008-01-10"
    start_date = "2024-08-01"
    end_date = datetime.now().strftime("%Y-%m-%d")

    # Download data
    print("Downloading tickers...")
    tickers_df = download_tickers()

    print("Downloading adjusted EOD price data...")
    time.sleep(10)  # Preventing rate-limiting
    prices_df = download_prices(start_date, end_date)

    # Process tickers data
    tickers_df.rename(columns={'ticker': 'symbol'}, inplace=True)
    tickers_df.reset_index(inplace=True, drop=True)
    tickers_df.reset_index(inplace=True)
    tickers_df.rename(columns={'index': 'sid'}, inplace=True)

    # Modify sid to be a valid Python identifier
    tickers_df['sid'] = 's' + tickers_df['sid'].astype(str)

    # Process prices data
    prices_df["date"] = pd.to_datetime(prices_df["date"])
    prices_df.rename(columns={"adj_open": 'open', 'adj_high': 'high', 'adj_low': 'low', 'adj_close': 'close', 'adj_volume': 'volume'}, inplace=True)
    prices_df.set_index(['ticker', 'date'], inplace=True)

    # Filter by trading days
    nyse_calendar = get_calendar('NYSE')
    trading_days = nyse_calendar.sessions_in_range(start=start_date, end=end_date)
    prices_df = prices_df[prices_df.index.get_level_values('date').isin(trading_days)]

    # Reindex to fill missing trading days
    prices_df = reindex_trading_days(prices_df, trading_days)
    prices_df.index.set_names(['ticker', 'date'], inplace=True)
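    # Fill remaining gaps within each ticker so values never leak from one ticker into the next
    # (fillna(method=...) is deprecated; ffill()/bfill() replace it)
    prices_df = prices_df.groupby(level='ticker').ffill().groupby(level='ticker').bfill()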

    # Ensure no missing symbols between tickers and prices
    tickers_in_prices = prices_df.index.get_level_values('ticker').unique()
    tickers_df = tickers_df[tickers_df['symbol'].isin(tickers_in_prices)]
    prices_df = prices_df[prices_df.index.get_level_values('ticker').isin(tickers_df['symbol'])]

    # Create sid mapping
    sid_map = tickers_df[['symbol', 'sid']].set_index('symbol')['sid'].to_dict()
    prices_df = prices_df.reset_index()
    prices_df['sid'] = prices_df['ticker'].map(sid_map)
    prices_df.set_index(['sid', 'date'], inplace=True)

    # Store data in HDF5 (mode='w' overwrites any existing file at h5_path)
    with pd.HDFStore(h5_path, mode='w') as store:
        # Store tickers data
        store.put('equities', tickers_df, format='table')

        # Store prices data for each sid
        for sid, group in tqdm(prices_df.groupby(level='sid'), desc="Saving prices to HDF5"):
            sid_data = group.droplevel('sid')[['open', 'high', 'low', 'close', 'volume']]
            sid_data = sid_data.astype({
                'open': 'float32',
                'high': 'float32',
                'low': 'float32',
                'close': 'float32',
                'volume': 'int32'
            })
            store.put(f'prices/{sid}', sid_data, format='table')


    print("Data successfully stored in HDF5 format.")

if __name__ == '__main__':
    create_hdf5()


Downloading tickers...
tickers response: {'datatable_bulk_download': {'file': {'link': 'https://aws-gis-link-pro-us-east-1-datahub.s3.amazonaws.com/export/QUOTEMEDIA/TICKERS/QUOTEMEDIA_TICKERS_6d75499fefd916e54334b292986eafcc.zip?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAX5EW3SB5DCZB2WCI%2F20241021%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241021T014556Z&X-Amz-Expires=1800&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEBoaCXVzLWVhc3QtMSJHMEUCIBw2NhtPnWPr3AYRFYBV8m21iELvL9JsoStU9E7SvtdlAiEAhFsSlxdh0Gxh%2BoIZfYcUEx3%2BpqXS9jN0majz5ZzFmBQqlQUIg%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FARACGgw1NDM2Mjk3NDIyMDIiDHFyy5rO2K4R%2FdtcQyrpBHoQ6p1uw2sEYY%2BQWWdgYuq0iRgMvzSM%2BtA8z4Ilw%2Bu4C8KqOoc1kqxqxXMI2Dd865No3wD5O2GTApU%2FWWRPzfQVrPijXenwBMUsQdTACNBku1jHauoicoJXhCtQHZeI%2FGOiQEwJoamb3eX%2Fdx5cI8gKx%2BXsT7l7%2BpmDs6vPOujguP5Ij7Wk%2FmTyI03SIIK4QR5czwbztyqwZENYDSzghHn7l7DDSyBmBMcHFgVGybzHHF3P%2FJdqnfjbqJ7cO8onwP%2FGi2e0C6nx5TmP4iXxiSnJMos89zfptRa%2FOnUxcTcktlPkaUJUzWxj6aPiJ%2FjDUx5AZkbA8UXNC76YYdffYY

  prices_df = prices_df.fillna(method='ffill').fillna(method='bfill')
Saving prices to HDF5:  50%|█████     | 5378/10677 [00:40<00:39, 133.17it/s]


KeyboardInterrupt: 

In [None]:
prices_df

In [None]:
tickers_df