<a href="https://colab.research.google.com/github/coatless/colab-notes/blob/main/03-dataframe-json-export-and-import-with-missing-and-date-values-prairielearn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# (Aside) Upgrade Pandas


Please make sure Pandas is updated to v1.5 or above.

In [24]:
!pip install pandas==1.5.*

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pandas==1.5.*
  Downloading pandas-1.5.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.2/12.2 MB[0m [31m61.1 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.3.5
    Uninstalling pandas-1.3.5:
      Successfully uninstalled pandas-1.3.5
Successfully installed pandas-1.5.2


In [13]:
!pip uninstall pandas -y

Found existing installation: pandas 1.5.2
Uninstalling pandas-1.5.2:
  Successfully uninstalled pandas-1.5.2


In [14]:
!pip install --pre --extra-index https://pypi.anaconda.org/scipy-wheels-nightly/simple pandas

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/, https://pypi.anaconda.org/scipy-wheels-nightly/simple
Collecting pandas
  Downloading https://pypi.anaconda.org/scipy-wheels-nightly/simple/pandas/2.0.0.dev0%2B1147.g7cb7592523/pandas-2.0.0.dev0%2B1147.g7cb7592523-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (47.1 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 MB[0m [31m12.5 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: pandas
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
db-dtypes 1.0.5 requires pandas<2.0dev,>=0.24.2, but you have pandas 2.0.0.dev0+1147.g7cb7592523 which is incompatible.[0m[31m
[0mSuccessfully installed pandas-2.0.0.dev0+1147.g7cb7592523


In [2]:
import pandas as pd 

pd.__version__

'1.5.2'

# Overview

We're attempting to see if a new JSON output can be used within PrairieLearn that preserves missing data. 
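
One reason missing values need care: Python's standard `json` module writes `float("nan")` as the bare token `NaN`, which is not part of the JSON specification, whereas JSON `null` loads back cleanly as `None`. A minimal standard-library sketch of that behavior (the variable names are illustrative only):

```python
import json

# By default, Python emits the non-standard token NaN for a missing float.
nan_text = json.dumps({"age": float("nan")})

# With allow_nan=False, the encoder refuses to write it at all.
try:
    json.dumps({"age": float("nan")}, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# JSON null, by contrast, is valid JSON and loads back as None.
null_value = json.loads('{"age": null}')["age"]

print(nan_text)    # {"age": NaN}
print(strict_ok)   # False
print(null_value)  # None
```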

## Example Data

A small sample data frame containing a missing value (`np.nan`) and a datetime column:

In [2]:
import pandas as pd 
import numpy as np

x = [
  {"city": "Champaign", "job":"Professor","age":35, 'time': pd.to_datetime('2022-10-06 12:00')},  
  {"city": "Sunnyvale", "job":"Driver","age":20, 'time': pd.to_datetime('2020-05-09 12:00')}, 
  {"city": "Mountain View", "job":"Data Scientist", "age":np.nan, 'time': pd.to_datetime('2021-12-14 12:00')}
]

df = pd.DataFrame(x)
display(df)

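|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |  35.0 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |  20.0 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   NaN | 2021-12-14 12:00:00 |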


## Conversion

Export the data frame to a JSON string using `orient="table"` and `date_format="iso"`, then wrap that string in a dictionary with `_type` and `_value` keys.

In [12]:
import json 

encoded_json_df = df.to_json(orient = "table", date_format = "iso")

pl_wrapped_df = {'_type': 'dataframe-v2', '_value': encoded_json_df}

print(json.dumps(pl_wrapped_df, indent = 4))

{
    "_type": "dataframe-v2",
    "_value": "{\"schema\":{\"fields\":[{\"name\":\"index\",\"type\":\"integer\"},{\"name\":\"city\",\"type\":\"string\"},{\"name\":\"job\",\"type\":\"string\"},{\"name\":\"age\",\"type\":\"number\"},{\"name\":\"time\",\"type\":\"datetime\"}],\"primaryKey\":[\"index\"],\"pandas_version\":\"0.20.0\"},\"data\":[{\"index\":0,\"city\":\"Champaign\",\"job\":\"Professor\",\"age\":35.0,\"time\":\"2022-10-06T12:00:00.000Z\"},{\"index\":1,\"city\":\"Sunnyvale\",\"job\":\"Driver\",\"age\":20.0,\"time\":\"2020-05-09T12:00:00.000Z\"},{\"index\":2,\"city\":\"Mountain View\",\"job\":\"Data Scientist\",\"age\":null,\"time\":\"2021-12-14T12:00:00.000Z\"}]}"
}


Note: `_value` holds a single string whose contents are escaped JSON.
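
To make the double encoding concrete, here is a small standard-library sketch. It assumes a hand-written stand-in for the Pandas output (`inner_text`), not the real export. Decoding the wrapper once still leaves `_value` as text, and a second `json.loads()` is needed to reach the table content.

```python
import json

# Stand-in for the string returned by df.to_json(orient="table").
inner_text = json.dumps({"data": [{"city": "Champaign", "age": None}]})
wrapped = {"_type": "dataframe-v2", "_value": inner_text}

# Serializing the wrapper escapes every quote inside `_value`.
outer_text = json.dumps(wrapped)

# First decode recovers the wrapper; `_value` is still a str.
decoded = json.loads(outer_text)
value_is_text = isinstance(decoded["_value"], str)

# Second decode recovers the table content itself.
table = json.loads(decoded["_value"])

print(value_is_text)            # True
print(table["data"][0]["age"])  # None
```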

## Dump JSON to Disk

Write JSON to disk:

In [9]:
import json

# Use a context manager so the file is flushed and closed after writing
with open("df_export.json", "w") as f:
  json.dump(pl_wrapped_df, f)

## Load JSON from Disk

Import the JSON dictionary with `json.load()`:

In [10]:
with open("df_export.json", "r") as f:
  ingested_from_json = json.load(f)

## Convert JSON dictionary to Pandas

Re-create the dataframe using the `orient='table'` option:

In [11]:
df_recreated = pd.read_json(ingested_from_json['_value'], orient="table")
df_recreated

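|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |  35.0 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |  20.0 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   NaN | 2021-12-14 12:00:00 |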


## Check Reconstruction

Verify that the original data frame matches the recreated data frame:

In [12]:
df.equals(df_recreated)

True

## Proposed Conversion Function with String

The prior steps can be combined into one function:

In [5]:
import json 

def json_encode_with_string(df):

  encoded_json_df = df.to_json(orient = "table", date_format = "iso")
  pl_wrapped_df = {'_type': 'dataframe-v2', '_value': encoded_json_df}
  # Write to disk and read back, closing each file handle
  with open("df_export.json", "w") as f:
    json.dump(pl_wrapped_df, f)
  with open("df_export.json", "r") as f:
    ingested_from_json = json.load(f)

  df_recreated = pd.read_json(ingested_from_json['_value'], orient="table")

  return df_recreated

json_encode_with_string(df)

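|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |  35.0 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |  20.0 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   NaN | 2021-12-14 12:00:00 |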


## Show data for GitHub

Generate the table in Markdown for the GitHub issue:

In [13]:
print(df.to_markdown())

|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |    35 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |    20 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   nan | 2021-12-14 12:00:00 |


# Save and Load with valid JSON

One question arose from the initial proposal: Is it possible to _retain_ a proper JSON structure without embedding it inside of a string? 

The answer is yes, but we need to parse the exported string into a Python dictionary before serializing the wrapper.

Here we use `json.loads()` to convert the exported Pandas JSON string into a dictionary, so that `_value` is written as nested JSON rather than as an escaped string.

In [11]:
import json 

encoded_json_df = df.to_json(orient = "table", date_format = "iso")

pl_wrapped_df = {'_type': 'dataframe-v2', '_value': json.loads(encoded_json_df)}

print(json.dumps(pl_wrapped_df, indent = 4))

{
    "_type": "dataframe-v2",
    "_value": {
        "schema": {
            "fields": [
                {
                    "name": "index",
                    "type": "integer"
                },
                {
                    "name": "city",
                    "type": "string"
                },
                {
                    "name": "job",
                    "type": "string"
                },
                {
                    "name": "age",
                    "type": "number"
                },
                {
                    "name": "time",
                    "type": "datetime"
                }
            ],
            "primaryKey": [
                "index"
            ],
            "pandas_version": "0.20.0"
        },
        "data": [
            {
                "index": 0,
                "city": "Champaign",
                "job": "Professor",
                "age": 35.0,
                "time": "2022-10-06T12:00:00.000Z"
            }

**Note:** `_value` is now a nested JSON object rather than an escaped string.

`pd.read_json()` expects JSON text (or a path or buffer), not a dictionary, so we convert `_value` back to a string with `json.dumps()` before reading it.

In [47]:
with open("df_export_jobs.json", "w") as f:
  json.dump(pl_wrapped_df, f)

with open("df_export_jobs.json", "r") as f:
  ingested_from_json_string = json.load(f)

# `_value` is a dictionary here, so serialize it back to a string
json_as_a_string = json.dumps(ingested_from_json_string["_value"])

pd.read_json(json_as_a_string, orient="table")

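|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |  35.0 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |  20.0 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   NaN | 2021-12-14 12:00:00 |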
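
As far as we are aware, Pandas 2.1 and later deprecate passing a literal JSON string to `pd.read_json()` and ask for a file-like object instead. The following is a hedged sketch of the same round trip using `io.StringIO`. The frame `small_df` is a stand-in invented for illustration (it omits the `time` column), and the trip through a file is simulated with `json.dumps()` and `json.loads()`.

```python
import json
from io import StringIO

import numpy as np
import pandas as pd

# Illustrative stand-in frame with one missing value.
small_df = pd.DataFrame(
    {"city": ["Champaign", "Sunnyvale", "Mountain View"], "age": [35, 20, np.nan]}
)

# Wrap the export as nested JSON rather than as an escaped string.
wrapped = {
    "_type": "dataframe-v2",
    "_value": json.loads(small_df.to_json(orient="table")),
}

# Simulate writing the wrapper to a file and reading it back.
restored_wrapper = json.loads(json.dumps(wrapped))

# Convert `_value` back to text and hand it to Pandas as a buffer.
json_text = json.dumps(restored_wrapper["_value"])
small_df_recreated = pd.read_json(StringIO(json_text), orient="table")

print(small_df_recreated)
```

Wrapping the text in `StringIO` also works on the Pandas 1.5 release used in this notebook, so the same call should behave consistently across versions.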


## Proposed Conversion function with proper JSON output

In [8]:
import json 

def json_encode_proper(df):

  encoded_json_df = df.to_json(orient = "table", date_format = "iso")
  # Enforce JSON object with loads()
  pl_wrapped_df = {'_type': 'dataframe-v2', '_value': json.loads(encoded_json_df)}
  
  # PrairieLearn Export and Import Dance
  with open("df_export.json", "w") as f:
    json.dump(pl_wrapped_df, f)
  with open("df_export.json", "r") as f:
    ingested_from_json = json.load(f)

  # Enforce JSON as a string with .dumps()
  json_as_string = json.dumps(ingested_from_json['_value'])

  # Reconstruct DataFrame
  df_recreated = pd.read_json(json_as_string, orient="table")

  return df_recreated

json_encode_proper(df)

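|    | city          | job            |   age | time                |
|---:|:--------------|:---------------|------:|:--------------------|
|  0 | Champaign     | Professor      |  35.0 | 2022-10-06 12:00:00 |
|  1 | Sunnyvale     | Driver         |  20.0 | 2020-05-09 12:00:00 |
|  2 | Mountain View | Data Scientist |   NaN | 2021-12-14 12:00:00 |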


# Problematic Index Header - BC Data Example

The breast cancer data set is problematic because its column labels are integers rather than strings.

cf. the discussion with [@eliotwrobson](https://github.com/eliotwrobson) in

<https://github.com/PrairieLearn/PrairieLearn/issues/6501>

In [2]:
import pandas as pd 

# Link to raw data (outside of PL)
bc = "https://raw.githubusercontent.com/coatless/raw-data/main/breast-cancer-train.dat"

# Read into DataFrame
df = pd.read_csv(bc, header=None).head(3)

# View DF
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,22,23,24,25,26,27,28,29,30,31
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758


## Problematic Column Index

We can see that the column names are all integers:

In [3]:
df.columns

Int64Index([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
           dtype='int64')

Unfortunately, JSON does not allow integer keys; object keys must be strings.
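
Python's own `json` module shows the consequence: it quietly writes integer keys as strings, so they do not come back as integers after a round trip. A minimal sketch (the `record` dictionary is illustrative only):

```python
import json

# A row keyed by integer column labels, as in the data frame above.
record = {0: 842302, 1: "M"}

# Integer keys are written out as strings.
text = json.dumps(record)

# Reading the text back yields string keys, not the original integers.
restored = json.loads(text)

keys_before = list(record)
keys_after = list(restored)

print(text)         # {"0": 842302, "1": "M"}
print(keys_before)  # [0, 1]
print(keys_after)   # ['0', '1']
```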

### Option 1: Coerce Column Index to String

We can address this by converting the column labels to strings:

In [17]:
import numpy as np 
df_modified_names = df.copy()

indexing_dtype = df_modified_names.columns.dtype
if indexing_dtype == np.float64 or indexing_dtype == np.int64:
  df_modified_names.columns = df_modified_names.columns.astype('string')

Each column then looks like:

In [18]:
df_modified_names.columns

Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
       '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24',
       '25', '26', '27', '28', '29', '30', '31'],
      dtype='object')
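
The dtype check above covers only `int64` and `float64` labels. As a rough alternative sketch, the hypothetical helper `stringify_column_labels` below (a name made up for this note, not part of Pandas or PrairieLearn) converts every label with `str()`, whatever its original type:

```python
import pandas as pd


def stringify_column_labels(df):
    """Return a copy of `df` whose column labels are all plain strings."""
    out = df.copy()
    out.columns = [str(label) for label in out.columns]
    return out


# Usage on a tiny frame with default integer column labels.
demo = pd.DataFrame([[842302, "M", 17.99]])
demo_named = stringify_column_labels(demo)

print(list(demo.columns))        # [0, 1, 2]
print(list(demo_named.columns))  # ['0', '1', '2']
```

Working on a copy leaves the original frame untouched, which matches the `df.copy()` pattern used in the cell above.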

### Option 2: Coerce Column Index to Letter String

Alternatively, we could use `ascii_letters` to specify new column names.

In [16]:
import string
df_modified_names_abc = df.copy() 

df_modified_names_abc.columns = list(string.ascii_letters[0:32])

In [14]:
df_modified_names_abc.columns

Index(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
       'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B',
       'C', 'D', 'E', 'F'],
      dtype='object')
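
Note that `string.ascii_letters` holds only 52 characters, so the slice above cannot label a wider data frame. One possible workaround, sketched here with a hypothetical helper `letter_labels` (not from the original notebook), is to generate spreadsheet-style labels that continue with `aa`, `ab`, and so on:

```python
import string


def letter_labels(n):
    """Return n unique labels: a..z, then aa, ab, ... (spreadsheet style)."""
    labels = []
    for i in range(n):
        label = ""
        k = i
        while True:
            # Peel off the last letter, then move to the next position.
            k, rem = divmod(k, 26)
            label = string.ascii_lowercase[rem] + label
            if k == 0:
                break
            k -= 1
        labels.append(label)
    return labels


print(letter_labels(3))        # ['a', 'b', 'c']
print(letter_labels(28)[-2:])  # ['aa', 'ab']
```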

### Modified Column Index DataFrame

Going with the coercion to digit strings (Option 1), we have:

In [7]:
encoded_string_json_df = df_modified_names.head(3).iloc[:, 0:5].to_json(orient = "table", date_format = "iso")
encoded_string_json_df

'{"schema":{"fields":[{"name":"index","type":"integer"},{"name":"0","type":"integer"},{"name":"1","type":"string"},{"name":"2","type":"number"},{"name":"3","type":"number"},{"name":"4","type":"number"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":0,"0":842302,"1":"M","2":17.99,"3":10.38,"4":122.8},{"index":1,"0":842517,"1":"M","2":20.57,"3":17.77,"4":132.9},{"index":2,"0":84300903,"1":"M","2":19.69,"3":21.25,"4":130.0}]}'

Pretty-printed, this representation is:

In [15]:
import json 
print(json.dumps(json.loads(encoded_string_json_df), indent = 4))

{
    "schema": {
        "fields": [
            {
                "name": "index",
                "type": "integer"
            },
            {
                "name": "0",
                "type": "integer"
            },
            {
                "name": "1",
                "type": "string"
            },
            {
                "name": "2",
                "type": "number"
            },
            {
                "name": "3",
                "type": "number"
            },
            {
                "name": "4",
                "type": "number"
            }
        ],
        "primaryKey": [
            "index"
        ],
        "pandas_version": "0.20.0"
    },
    "data": [
        {
            "index": 0,
            "0": 842302,
            "1": "M",
            "2": 17.99,
            "3": 10.38,
            "4": 122.8
        },
        {
            "index": 1,
            "0": 842517,
            "1": "M",
            "2": 20.57,
            "3": 17.7

Thus, we can reconstruct the data frame with:

In [10]:
pd.read_json(encoded_string_json_df, orient="table")

|    |        0 | 1   |     2 |     3 |     4 |
|---:|---------:|:----|------:|------:|------:|
|  0 |   842302 | M   | 17.99 | 10.38 | 122.8 |
|  1 |   842517 | M   | 20.57 | 17.77 | 132.9 |
|  2 | 84300903 | M   | 19.69 | 21.25 | 130.0 |


## Encoding with String

In this example, we keep the output of `.to_json()` as a string instead of converting it into a dictionary with `json.loads()`.

In [56]:
import json

pl_wrapped_df_string = {'_type': 'dataframe-v2', '_value': encoded_string_json_df}

with open("df_export_as_json_string.json", "w") as f:
  json.dump(pl_wrapped_df_string, f)

with open("df_export_as_json_string.json", "r") as f:
  ingested_from_json_string = json.load(f)

df_recreated_string = pd.read_json(ingested_from_json_string['_value'], orient="table")
df_recreated_string

|    |        a | b   |     c |     d |     e |
|---:|---------:|:----|------:|------:|------:|
|  0 |   842302 | M   | 17.99 | 10.38 | 122.8 |
|  1 |   842517 | M   | 20.57 | 17.77 | 132.9 |
|  2 | 84300903 | M   | 19.69 | 21.25 | 130.0 |


## Bad encoding with original JSON

In this example, we coerce the string from `.to_json()` into a dictionary with the `json.loads()` function.

In [40]:
import json

pl_wrapped_df_proper = {'_type': 'dataframe-v2', '_value': json.loads(encoded_string_json_df)}

with open("df_export_as_json_proper.json", "w") as f:
  json.dump(pl_wrapped_df_proper, f)

with open("df_export_as_json_proper.json", "r") as f:
  ingested_from_json_proper = json.load(f)

As a result, `pd.read_json()` raises a `ValueError`, because it receives a `dict` instead of a string, path, or buffer:

```python
df_recreated_proper = pd.read_json(ingested_from_json_proper['_value'], orient="table")
```

```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-373712fc9f69> in <module>
----> 1 df_recreated_proper = pd.read_json(ingested_from_json_proper['_value'], orient="table")
      2 df_recreated_proper

6 frames
/usr/local/lib/python3.8/dist-packages/pandas/io/common.py in _get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode, storage_options)
    449     ):
    450         msg = f"Invalid file path or buffer object type: {type(filepath_or_buffer)}"
--> 451         raise ValueError(msg)
    452 
    453     return IOArgs(

ValueError: Invalid file path or buffer object type: <class 'dict'>
```

## Converting JSON back to String

The quickest fix is to convert the parsed JSON back into string form before passing it to `pd.read_json()`.

In [37]:
val = json.dumps(ingested_from_json_proper["_value"])
df_recreated_proper = pd.read_json(val, orient="table")
df_recreated_proper

|    |        a | b   |     c |     d |     e |
|---:|---------:|:----|------:|------:|------:|
|  0 |   842302 | M   | 17.99 | 10.38 | 122.8 |
|  1 |   842517 | M   | 20.57 | 17.77 | 132.9 |
|  2 | 84300903 | M   | 19.69 | 21.25 | 130.0 |
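
Since `_value` may arrive either as an escaped string or as a parsed dictionary, a reader could normalize both cases before calling `pd.read_json()`. A minimal sketch, assuming a hypothetical helper named `value_as_json_text` (not part of Pandas or PrairieLearn):

```python
import json


def value_as_json_text(value):
    """Return JSON text whether `value` is already a str or a parsed object."""
    if isinstance(value, str):
        # String-encoded `_value`: already suitable for a JSON reader.
        return value
    # Parsed `_value` (dict or list): serialize it back to text.
    return json.dumps(value)


# Both encodings of the same content yield equivalent JSON text.
from_string = value_as_json_text('{"data": [1, 2]}')
from_dict = value_as_json_text({"data": [1, 2]})

print(json.loads(from_string) == json.loads(from_dict))  # True
```

The returned text can then be passed to `pd.read_json(..., orient="table")` in the same way as `val` in the cell above.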


# Complex Numbers

A sample data frame of complex numbers:

In [None]:
# create a dataframe with complex numbers
df = pd.DataFrame({'a': [1 + 2j, 3 + 4j], 'b': [5 + 6j, 7 + 8j]})

The approach above, put into a function with the _option_ to use different orientation types:

In [15]:
import pandas as pd
import json


def json_conversion(df, orient_type = "values"):

  print(df)
  # convert dataframe to a JSON string
  json_str = df.to_json(orient=orient_type)

  # write the JSON string to a file
  with open(f'data_{orient_type}.json', 'w') as f:
      json.dump(json_str, f)

  # read the JSON string from the file
  with open(f'data_{orient_type}.json', 'r') as f:
      json_str = json.load(f)

  # convert the JSON string back to a dataframe
  df2 = pd.read_json(json_str, orient=orient_type)

  # display the dataframe
  print(df2)

  return df2


## Test JSON Conversion with orient types

In [16]:
json_conversion(df, "values")

|    |   a |   b |
|---:|----:|----:|
|  0 |   1 |   5 |
|  1 |   3 |   7 |
          a         b
0  1.0+2.0j  5.0+6.0j
1  3.0+4.0j  7.0+8.0j
                            0                           1
0  {'imag': 2.0, 'real': 1.0}  {'imag': 6.0, 'real': 5.0}
1  {'imag': 4.0, 'real': 3.0}  {'imag': 8.0, 'real': 7.0}



|    | 0                          | 1                          |
|---:|:---------------------------|:---------------------------|
|  0 | {'imag': 2.0, 'real': 1.0} | {'imag': 6.0, 'real': 5.0} |
|  1 | {'imag': 4.0, 'real': 3.0} | {'imag': 8.0, 'real': 7.0} |


In [17]:
json_conversion(df, "table")

          a         b
0  1.0+2.0j  5.0+6.0j
1  3.0+4.0j  7.0+8.0j


TypeError: ignored

## View output JSON

In [9]:
!cat data_table.json

"{\"schema\":{\"fields\":[{\"name\":\"index\",\"type\":\"integer\"},{\"name\":\"a\",\"type\":\"number\"},{\"name\":\"b\",\"type\":\"number\"}],\"primaryKey\":[\"index\"],\"pandas_version\":\"1.4.0\"},\"data\":[{\"index\":0,\"a\":{\"imag\":2.0,\"real\":1.0},\"b\":{\"imag\":6.0,\"real\":5.0}},{\"index\":1,\"a\":{\"imag\":4.0,\"real\":3.0},\"b\":{\"imag\":8.0,\"real\":7.0}}]}"

In [8]:
!cat data_values.json

"[[{\"imag\":2.0,\"real\":1.0},{\"imag\":6.0,\"real\":5.0}],[{\"imag\":4.0,\"real\":3.0},{\"imag\":8.0,\"real\":7.0}]]"
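
The files above store each complex value as an object with `imag` and `real` keys, which is why `pd.read_json()` hands back dictionaries instead of complex numbers. One way to recover the values outside of Pandas is a `json.loads()` `object_hook`. This is only a sketch: `as_complex` is a hypothetical helper, and `file_contents` re-creates the `data_values.json` text shown above rather than reading the file.

```python
import json

# Re-create the contents of data_values.json: a JSON string holding JSON text.
file_contents = json.dumps(
    '[[{"imag":2.0,"real":1.0},{"imag":6.0,"real":5.0}],'
    '[{"imag":4.0,"real":3.0},{"imag":8.0,"real":7.0}]]'
)


def as_complex(obj):
    """Turn {"real": ..., "imag": ...} objects into Python complex numbers."""
    if set(obj) == {"real", "imag"}:
        return complex(obj["real"], obj["imag"])
    return obj


# First decode unwraps the outer string; the second parses the rows.
inner_text = json.loads(file_contents)
rows = json.loads(inner_text, object_hook=as_complex)

print(rows)  # [[(1+2j), (5+6j)], [(3+4j), (7+8j)]]
```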

## Prior Approach

In [20]:
# create a dataframe with complex numbers
df = pd.DataFrame({'a': [1 + 2j, 3 + 4j], 'b': [5 + 6j, 7 + 8j]})

v = df.copy()

json_df = {
  "_type": "dataframe",
  "_value": {
      "index": list(v.index),
      "columns": list(v.columns),
      "data": v.values.tolist(),
  },
}

val = json_df["_value"]

rebuild_df = pd.DataFrame(
      index=val["index"], columns=val["columns"], data=val["data"]
)

rebuild_df

|    | a        | b        |
|---:|:---------|:---------|
|  0 | 1.0+2.0j | 5.0+6.0j |
|  1 | 3.0+4.0j | 7.0+8.0j |
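
The rebuild above happens entirely in memory. If the same nested lists were passed through the standard `json` module, the complex entries would raise a `TypeError`, because `json` has no built-in encoding for `complex`. A hedged sketch of one workaround uses the `default=` hook with a hypothetical `encode_complex` helper. This is an illustration, not how PrairieLearn handles it.

```python
import json

# Same values as v.values.tolist() would give for the frame above.
values = [[1 + 2j, 5 + 6j], [3 + 4j, 7 + 8j]]

# Plain json.dumps() cannot serialize complex numbers.
try:
    json.dumps(values)
    plain_dump_works = True
except TypeError:
    plain_dump_works = False


def encode_complex(obj):
    """Encode complex numbers as {"real": ..., "imag": ...} objects."""
    if isinstance(obj, complex):
        return {"real": obj.real, "imag": obj.imag}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


text = json.dumps(values, default=encode_complex)

print(plain_dump_works)  # False
print(text)
```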


# Pandas Data Types

Explore the conversion with more data types:

In [25]:
import pandas as pd 
import numpy as np 

dft = pd.DataFrame(
    {
        # Scalars
        "integer": 1,
        "numeric": 3.14,
        "logical": False,
        "character": "foo",
        #"complex": complex(1, 2),
        # Series
        "numeric-list": pd.Series([1.0] * 3).astype("float32"),
        "integer-list": pd.Series([1] * 3, dtype="int8"),
        #"complex-list": pd.Series(np.array([1, 2, 3]) + np.array([4, 5, 6]) *1j).astype("complex128"),
        "character-list": pd.Series(["hello", "world", "stat"]),
        "logical-list": pd.Series([True, False, True]),
        "character-string-list": pd.Series(["a", "b", "c"], dtype="string"),
        # Time Dependency: https://pandas.pydata.org/docs/user_guide/timeseries.html
        "POSIXct-POSIXt-timestamp": pd.Timestamp("20230102"),
        "POSIXct-POSIXt-date_range": pd.date_range("2023", freq="D", periods=3),
        #"POSIXct-POSIXt-period": pd.period_range("1/1/2011", freq="M", periods=3), # Not supported in rpy2
        #"POSIXct-POSIXt-timedelta": pd.to_timedelta(np.arange(3), unit="s"), # Not supported in rpy2
        # Categorical: https://pandas.pydata.org/docs/user_guide/categorical.html
        "factor": pd.Categorical(["a", "b", "c"], ordered=False),
        "ordered-factor": pd.Categorical(["a", "b", "c"], categories=["a", "b", "c"], ordered=True),
    }
)

dft

Unnamed: 0,integer,numeric,logical,character,numeric-list,integer-list,character-list,logical-list,character-string-list,POSIXct-POSIXt-timestamp,POSIXct-POSIXt-date_range,factor,ordered-factor
0,1,3.14,False,foo,1.0,1,hello,True,a,2023-01-02,2023-01-01,a,a
1,1,3.14,False,foo,1.0,1,world,False,b,2023-01-02,2023-01-02,b,b
2,1,3.14,False,foo,1.0,1,stat,True,c,2023-01-02,2023-01-03,c,c


In [37]:
json_conversion(dft, "table")

   integer  numeric  logical character  numeric-list  integer-list  \
0        1     3.14    False       foo           1.0             1   
1        1     3.14    False       foo           1.0             1   
2        1     3.14    False       foo           1.0             1   

  character-list  logical-list character-string-list POSIXct-POSIXt-timestamp  \
0          hello          True                     a               2023-01-02   
1          world         False                     b               2023-01-02   
2           stat          True                     c               2023-01-02   

  POSIXct-POSIXt-date_range factor ordered-factor  
0                2023-01-01      a              a  
1                2023-01-02      b              b  
2                2023-01-03      c              c  
   integer  numeric  logical character  numeric-list  integer-list  \
0        1     3.14    False       foo           1.0             1   
1        1     3.14    False       foo        

Unnamed: 0,integer,numeric,logical,character,numeric-list,integer-list,character-list,logical-list,character-string-list,POSIXct-POSIXt-timestamp,POSIXct-POSIXt-date_range,factor,ordered-factor
0,1,3.14,False,foo,1.0,1,hello,True,a,2023-01-02,2023-01-01,a,a
1,1,3.14,False,foo,1.0,1,world,False,b,2023-01-02,2023-01-02,b,b
2,1,3.14,False,foo,1.0,1,stat,True,c,2023-01-02,2023-01-03,c,c


## Debug Requested Version Information

In [2]:
import pandas as pd 
pd.show_versions()


INSTALLED VERSIONS
------------------
commit           : 8dab54d6573f7186ff0c3b6364d5e4dd635ff3e7
python           : 3.8.16.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.10.147+
Version          : #1 SMP Sat Dec 10 16:00:40 UTC 2022
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.5.2
numpy            : 1.21.6
pytz             : 2022.7
dateutil         : 2.8.2
setuptools       : 57.4.0
pip              : 22.0.4
Cython           : 0.29.32
pytest           : 3.6.4
hypothesis       : None
sphinx           : 3.5.4
blosc            : None
feather          : 0.4.1
xlsxwriter       : None
lxml.etree       : 4.9.2
html5lib         : 1.0.1
pymysql          : None
psycopg2         : 2.9.5
jinja2           : 2.11.3
IPython          : 7.9.0
pandas_datareader: 0.9.0
bs4              : 4.6.3
bottleneck       : None
brotli           : 