Logging for pandas and more.
It provides functions for logging pandas DataFrames, as well as functions that count the total number of errors and warnings logged.
- pandas: Tested on 0.20.3 and higher. May work for earlier versions.
For this example, import both redquill and pandas.
import redquill as rq
import pandas as pd
Initialize the logger. Note that the logger logs to the console by default. You may also include a log directory and a log file name, which sends the logs to a file as well.
log = rq.redquill()
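To illustrate what logging to the console plus an optional file amounts to, here is a minimal sketch that uses only Python's standard logging module. It is not redquill's implementation: the helper name build_logger, its parameters log_dir and log_file_name, and the format string (loosely modeled on the console output shown in this document) are all assumptions made for illustration.

```python
import logging
import os


def build_logger(name, log_dir=None, log_file_name=None):
    """Hypothetical helper: always log to the console, and also to a file
    when both a directory and a file name are given."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid stacking handlers on repeated calls

    # Assumed format: timestamp, process id, level, module.function, message.
    formatter = logging.Formatter(
        "%(asctime)s - %(process)d - %(levelname)s - "
        "%(module)s.%(funcName)s - %(message)s"
    )

    # Console handler is always attached.
    console = logging.StreamHandler()
    console.setFormatter(formatter)
    logger.addHandler(console)

    # File handler is attached only when a destination is fully specified.
    if log_dir and log_file_name:
        os.makedirs(log_dir, exist_ok=True)
        file_handler = logging.FileHandler(os.path.join(log_dir, log_file_name))
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger


example_log = build_logger("example")
example_log.warning("Something looks off.")
```

Passing both a directory and a file name to this sketch would add the file handler, so each message goes to the console and to the file.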
The following will log any null values found in df.
df = pd.DataFrame({"A": [1, 2, None, 3],
                   "B": [4, 2, 2, 5]})
log.warn_null_values(df=df)
Console output.
2018-03-04 01:34:33,162 - 23512 - WARNING - test_redquill.test_warn_null_values -
A B
2 NaN 2
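For readers who want to see how the logged row can be selected, the sketch below uses plain pandas to pick out rows containing at least one null. This is an assumption about an equivalent selection, not necessarily how warn_null_values works internally; the variable name null_rows is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, None, 3],
                   "B": [4, 2, 2, 5]})

# Boolean mask: True for every row where at least one column is null.
has_null = df.isnull().any(axis=1)

# Keep only those rows; here that is the single row at index 2.
null_rows = df[has_null]
print(null_rows)
```

The selected row matches the one in the console output above: index 2, where A is NaN and B is 2.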
The following will log any duplicates found in column B of df.
df = pd.DataFrame({"A": [1, 2, None, 3],
                   "B": [4, 2, 2, 5]})
log.warn_duplicate_values(df=df, subset="B", msg="Duplicates on B.")
Console output.
2018-03-04 02:29:41,419 - 31825 - WARNING - test_redquill.test_warn_duplicate_values - Duplicates on B.
A B
1 2.0 2
2 NaN 2
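The rows in this output can be reproduced with plain pandas, as sketched below. This assumes the check is equivalent to DataFrame.duplicated with keep=False, which flags every member of a duplicate group rather than only the later occurrences; it is not taken from redquill's source, and the name duplicate_rows is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, None, 3],
                   "B": [4, 2, 2, 5]})

# keep=False marks all rows that share a value in column B,
# not just the second and later occurrences.
is_duplicate = df.duplicated(subset="B", keep=False)

# Rows at index 1 and 2 both have B == 2.
duplicate_rows = df[is_duplicate]
print(duplicate_rows)
```

With the default keep="first", only the row at index 2 would be flagged, which would not match the two rows shown in the console output.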