Running pm_stability_report on float columns with large values triggers (in some cases) an AssertionError.
For example, running the following code:
import pandas as pd
import numpy as np
import popmon
np.random.seed(1)
n = 1000
start_date = pd.to_datetime("2022-01-01")
example = pd.DataFrame({
    "dt": [start_date + pd.DateOffset(i//100) for i in range(n)],
    "a": (np.random.rand(n) - 0.5) * 10**4
})
example.loc[len(example)//2, 'a'] *= 10**4
example.pm_stability_report(time_axis="dt", time_width="1w")
Gives the following output:
% python popmon_bug.py
.../.virtualenvs/random/lib/python3.7/site-packages/histogrammar/dfinterface/make_histograms.py:172: UserWarning: time-axis "dt" already found in binning specifications. not overwriting.
f'time-axis "{time_axis}" already found in binning specifications. not overwriting.'
2022-08-12 14:14:19,649 INFO [histogram_filler_base]: Filling 1 specified histograms. auto-binning.
100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 463.15it/s]
2022-08-12 14:14:19,652 INFO [hist_splitter]: Splitting histograms "hists" as "split_hists"
2022-08-12 14:14:19,654 INFO [hist_comparer]: Comparing "split_hists" with rolling sum of 1 previous histogram(s).
2022-08-12 14:14:19,666 INFO [hist_profiler]: Profiling histograms "split_hists" as "profiles"
2022-08-12 14:14:19,692 INFO [hist_comparer]: Comparing "split_hists" with reference "split_hists"
2022-08-12 14:14:19,702 INFO [pull_calculator]: Comparing "comparisons" with median/mad of reference "comparisons"
2022-08-12 14:14:19,713 INFO [pull_calculator]: Comparing "profiles" with median/mad of reference "profiles"
2022-08-12 14:14:19,749 INFO [apply_func]: Computing significance of (rolling) trend in means of features
2022-08-12 14:14:19,752 INFO [compute_tl_bounds]: Calculating static bounds for "profiles"
2022-08-12 14:14:19,795 INFO [compute_tl_bounds]: Calculating static bounds for "comparisons"
2022-08-12 14:14:19,806 INFO [compute_tl_bounds]: Calculating traffic light alerts for "profiles"
2022-08-12 14:14:19,819 INFO [compute_tl_bounds]: Calculating traffic light alerts for "comparisons"
2022-08-12 14:14:19,825 INFO [apply_func]: Generating traffic light alerts summary.
2022-08-12 14:14:19,828 INFO [alerts_summary]: Combining alerts into artificial variable "_AGGREGATE_"
2022-08-12 14:14:19,831 INFO [report_pipelines]: Generating report "html_report".
2022-08-12 14:14:19,831 INFO [overview_section]: Generating section "Overview". skip empty plots: True
100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 276.10it/s]
2022-08-12 14:14:19,842 INFO [histogram_section]: Generating section "Histograms".
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "popmon_bug.py", line 13, in <module>
example.pm_stability_report(time_axis="dt", time_width="1w")
File ".../python3.7/site-packages/popmon/pipeline/report.py", line 196, in df_stability_report
reference=reference_hists,
File ".../python3.7/site-packages/popmon/pipeline/report.py", line 71, in stability_report
result = pipeline.transform(datastore)
File ".../python3.7/site-packages/popmon/base/pipeline.py", line 69, in transform
datastore = module.transform(datastore)
File ".../python3.7/site-packages/popmon/pipeline/report_pipelines.py", line 250, in transform
return super().transform(datastore)
File ".../python3.7/site-packages/popmon/base/pipeline.py", line 69, in transform
datastore = module.transform(datastore)
File ".../python3.7/site-packages/popmon/base/module.py", line 50, in _transform
outputs = func(self, *list(inputs.values()))
File ".../python3.7/site-packages/popmon/visualization/histogram_section.py", line 141, in transform
plots = parallel(_plot_histograms, args)
File ".../python3.7/site-packages/popmon/utils.py", line 52, in parallel
func(*args) if mode == "args" else func(**args) for args in args_list
File ".../python3.7/site-packages/popmon/utils.py", line 52, in <listcomp>
func(*args) if mode == "args" else func(**args) for args in args_list
File ".../python3.7/site-packages/popmon/visualization/histogram_section.py", line 247, in _plot_histograms
hists, feature, hist_names, y_label, is_num, is_ts
File ".../python3.7/site-packages/popmon/visualization/utils.py", line 297, in plot_histogram_overlay
len(bin_edges), len(bin_values), x_label
AssertionError: bin edges (+ upper edge) and bin values have inconsistent lengths: 43 vs 41. a
It seems that this might be an issue with Histogrammar.
From debugging, it looks like SparselyBin can, in some cases, return more bin edges than expected, i.e. len(hist.bin_edges(low, high)) > len(hist.bin_entries(low, high)) + 1.
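To make the failing condition concrete, here is a minimal sketch of the length invariant that the assertion in plot_histogram_overlay appears to enforce: a histogram with N bins should have N + 1 edges. The helper names (edges_match_values, describe_mismatch) are hypothetical and are not part of popmon or Histogrammar, and the lists are placeholders that only reproduce the lengths from the error message (43 edges vs 41 values), not the real histogram contents.

```python
def edges_match_values(bin_edges, bin_values):
    """Return True when there is exactly one more edge than there are bin values."""
    return len(bin_edges) == len(bin_values) + 1


def describe_mismatch(bin_edges, bin_values):
    """Return the number of surplus edges (negative if edges are missing)."""
    return len(bin_edges) - (len(bin_values) + 1)


# Lengths taken from the assertion message: 43 edges vs 41 values.
edges = list(range(43))   # placeholder edge positions, not the real data
values = [0.0] * 41       # placeholder bin contents, not the real data

print(edges_match_values(edges, values))  # False
print(describe_mismatch(edges, values))   # 1
```

A check like this could be run directly on the outputs of hist.bin_edges(low, high) and hist.bin_entries(low, high) to confirm whether the mismatch already exists before the plotting code is reached.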