Yevgnen/cacheframe

Introduction

Simple Pandas dataframe file cache.

Installation

From pip

pip install cacheframe

From source

pip install git+https://github.com/Yevgnen/cacheframe.git

Usage

The only provided function decorator is cacheframe.cacheframe with the following arguments:

  • cache_dir: directory to place cache files (default: .cache)
  • file: cache file name; supported file types are .csv, .xlsx, .pickle, .json, .parquet and .feather (default: dataframe.parquet)
  • read_kwds: optional keyword arguments passed to the readers (pandas.read_*) when reading the cache (default: None)
  • write_kwds: optional keyword arguments passed to the writers (DataFrame.to_*) when writing the cache (default: None)
  • ttl: optional time-to-live (TTL) after which the cache is invalidated; the example below suggests the value is in seconds (default: None)
  • disable: boolean indicator to enable or disable cache (default: False)

The wrapped function should return a single dataframe.

Note that Pandas does NOT install the engines required for reading or writing some file types, so you may need to install them manually, e.g. pyarrow for .feather, fastparquet for .parquet, and openpyxl for .xlsx.
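
To make the meaning of cache_dir, file, ttl and disable more concrete, here is a minimal sketch of how a TTL-based file cache decorator can work in general. It is not cacheframe's implementation: the name ttl_file_cache is hypothetical, it stores results with the standard-library pickle module rather than the Pandas readers and writers, and it assumes that expiry is judged by the cache file's modification time.

```python
import functools
import os
import pickle
import tempfile
import time


def ttl_file_cache(cache_dir=".cache", file="result.pickle", ttl=None, disable=False):
    """Hypothetical stand-in for a file cache decorator; not cacheframe's code."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if disable:
                # Caching is switched off: always call the wrapped function.
                return func(*args, **kwargs)

            path = os.path.join(cache_dir, file)
            if os.path.exists(path):
                # Assumption: the age of the cache is the file's mtime age.
                age = time.time() - os.path.getmtime(path)
                if ttl is None or age < ttl:
                    # Cache file exists and has not expired: load it.
                    with open(path, "rb") as f:
                        return pickle.load(f)

            # Cache is missing or expired: recompute and overwrite the file.
            result = func(*args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            with open(path, "wb") as f:
                pickle.dump(result, f)
            return result

        return wrapper

    return decorator


demo_dir = tempfile.mkdtemp()  # throwaway cache directory for the demo
calls = []  # records how often the wrapped function really runs


@ttl_file_cache(cache_dir=demo_dir, ttl=60)
def load_rows():
    calls.append(1)
    return [{"x": 1, "y": 2}, {"x": 99, "y": 100}]


first = load_rows()   # computed and written to the cache file
second = load_rows()  # served from the cache file
```

With a real dataframe, the pickle calls would be replaced by a matching reader/writer pair such as pandas.read_parquet and DataFrame.to_parquet, which is where arguments like read_kwds and write_kwds would be forwarded.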

Example

# -*- coding: utf-8 -*-

import time

import pandas as pd
from pandas import DataFrame

from cacheframe import cacheframe


@cacheframe(cache_dir=".cache", file="dataframe.parquet", ttl=3)
def read_large_dataframe() -> DataFrame:
    print("Reading a very large dataframe...")
    time.sleep(2)

    df = pd.DataFrame([{"x": 1, "y": 2}, {"x": 99, "y": 100}])

    return df


if __name__ == "__main__":
    print("Read once...")
    df = read_large_dataframe()  # Prints "Reading a very large dataframe..."
    print(df)

    print("Read again...")
    df = read_large_dataframe()  # Cache is read
    print(df)

    print("Wait 5 seconds and read again...")
    time.sleep(5)
    df = read_large_dataframe()  # Cache expired, prints "Reading a very large dataframe..."
    print(df)

Contribution

Formatting Code

To keep the codebase compliant with PEP 8, please format your code with black and isort and check it with flake8 before submitting changes.