Commit

Merge branch 'develop' into merge-master
sbrugman committed Dec 7, 2020
2 parents dd707f4 + 6424d27 commit c3d038d
Showing 17 changed files with 519 additions and 136 deletions.
36 changes: 35 additions & 1 deletion README.rst
@@ -2,7 +2,7 @@
Population Shift Monitoring
===========================

|build| |docs| |release| |release_date|
|build| |docs| |release| |release_date| |downloads|

|logo|

@@ -128,6 +128,37 @@ These examples also work with spark dataframes.
You can see the output of such example notebook code `here <https://crclz.com/popmon/reports/test_data_report.html>`_.
For all available examples, please see the `tutorials <https://popmon.readthedocs.io/en/latest/tutorials.html>`_ at read-the-docs.

Resources
=========

Presentations
-------------

+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Title | Host | Date | Speaker |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon - population monitoring made easy | `Data Lunch @ Eneco <https://www.eneco.nl/>`_ | October 29, 2020 | Max Baak, Simon Brugman |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon - population monitoring made easy | `Data Science Summit 2020 <https://dssconf.pl/en/>`_ | October 16, 2020 | Max Baak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| `Population Shift Monitoring Made Easy: the popmon package <https://youtu.be/PgaQpxzT_0g>`_ | `Online Data Science Meetup @ ING WBAA <https://www.meetup.com/nl-NL/Tech-Meetups-ING/events/>`_ | July 8, 2020 | Tomas Sostak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| `Popmon: Population Shift Monitoring Made Easy <https://www.youtube.com/watch?v=HE-3YeVYqPY>`_ | `PyData Fest Amsterdam 2020 <https://amsterdam.pydata.org/>`_ | June 16, 2020 | Tomas Sostak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon: Population Shift Monitoring Made Easy | `Amundsen Community Meetup <https://github.com/amundsen-io/amundsen>`_ | June 4, 2020 | Max Baak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+


Articles
--------

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+
| Title | Date | Author |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+
| `Popmon Open Source Package — Population Shift Monitoring Made Easy <https://medium.com/wbaa/population-monitoring-open-source-1ce3139d8c3a>`_ | May 20, 2020 | Nicole Mpozika |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+


Project contributors
====================

@@ -171,3 +202,6 @@ Copyright ING WBAA. `popmon` is completely free, open-source and licensed under
.. |notebook_incremental_data_colab| image:: https://colab.research.google.com/assets/colab-badge.svg
:alt: Open in Colab
:target: https://colab.research.google.com/github/ing-bank/popmon/blob/master/popmon/notebooks/popmon_tutorial_incremental_data.ipynb
.. |downloads| image:: https://pepy.tech/badge/popmon
:alt: PyPi downloads
:target: https://pepy.tech/project/popmon
8 changes: 4 additions & 4 deletions popmon/alerting/alerts_summary.py
@@ -44,7 +44,7 @@ def __init__(
:param str read_key: key of input data to read from datastore.
:param str store_key: key of output data to store in datastore (optional).
:param str combined_variable: name of artifical variable that combines all alerts. default is '_AGGREGATE_'.
:param str combined_variable: name of artificial variable that combines all alerts. default is '_AGGREGATE_'.
:param list features: features of data frames to pick up from input data (optional).
:param list ignore_features: list of features to ignore (optional).
"""
@@ -77,7 +77,7 @@ def transform(self, datastore):
df = (self.get_datastore_object(data, feature, dtype=pd.DataFrame)).copy(
deep=False
)
df.columns = [feature + "_" + c for c in df.columns]
df.columns = [f"{feature}_{c}" for c in df.columns]
df_list.append(df)

# the different features could technically have different indices.
@@ -99,8 +99,8 @@ def transform(self, datastore):
dfc["worst"] = tlv[cols].values.max(axis=1) if len(cols) else 0
# colors of traffic lights
for color in ["green", "yellow", "red"]:
cols = fnmatch.filter(tlv.columns, "*_n_{}".format(color))
dfc["n_{}".format(color)] = tlv[cols].values.sum(axis=1) if len(cols) else 0
cols = fnmatch.filter(tlv.columns, f"*_n_{color}")
dfc[f"n_{color}"] = tlv[cols].values.sum(axis=1) if len(cols) else 0

# store combination of traffic alerts
data[self.combined_variable] = dfc
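Note on this hunk: the switch from `str.format` to f-strings should not change which columns are selected. A minimal sketch of that equivalence, using hypothetical column names in place of the real traffic-light DataFrame columns:

```python
import fnmatch

# Hypothetical column names; the real ones come from the traffic-light frame.
columns = ["age_n_green", "age_n_red", "income_n_green", "income_worst"]


def columns_for(color):
    """Return the column names matching the per-color counter pattern."""
    old_pattern = "*_n_{}".format(color)  # form used before this commit
    new_pattern = f"*_n_{color}"          # form used after this commit
    # Both spellings build the same glob pattern.
    assert old_pattern == new_pattern
    return fnmatch.filter(columns, new_pattern)


for color in ["green", "yellow", "red"]:
    print(color, columns_for(color))
```

A color with no matching columns yields an empty list, which is the case the `if len(cols) else 0` guard in the diff handles.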
2 changes: 1 addition & 1 deletion popmon/hist/histogram.py
@@ -211,7 +211,7 @@ def __repr__(self):
return f"HistogramContainer(dtype={self.npdtype}, n_dims={self.n_dim})"

def __str__(self):
return str(self)
return repr(self)

def _edit_name(self, axis_name, xname, yname, convert_time_index, short_keys):
if convert_time_index and self.is_ts:
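Note on this hunk: a `__str__` that returns `str(self)` calls itself until Python raises `RecursionError`, whereas delegating to `repr(self)` terminates. The sketch below reproduces both behaviors with two hypothetical stand-in classes (not popmon's `HistogramContainer`):

```python
class BrokenContainer:
    """Stand-in mirroring the old code: __str__ calls str(self)."""

    def __repr__(self):
        return "BrokenContainer()"

    def __str__(self):
        return str(self)  # re-enters __str__ without end


class FixedContainer:
    """Stand-in mirroring the new code: __str__ delegates to __repr__."""

    def __repr__(self):
        return "FixedContainer()"

    def __str__(self):
        return repr(self)


def safe_str(obj):
    """Return str(obj), or a marker string if str() recurses forever."""
    try:
        return str(obj)
    except RecursionError:
        return "<RecursionError>"


print(safe_str(FixedContainer()))
print(safe_str(BrokenContainer()))
```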
60 changes: 33 additions & 27 deletions popmon/notebooks/popmon_tutorial_advanced.ipynb
@@ -4,7 +4,6 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
@@ -26,10 +25,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
"!{sys.executable} -m pip install popmon"
"!\"{sys.executable}\" -m pip install popmon"
]
},
{
@@ -145,11 +145,13 @@
"outputs": [],
"source": [
"# download histogrammar jar files if not already installed, used for histogramming of spark dataframe\n",
"from pyspark.sql import SparkSession\n",
"try:\n",
" from pyspark.sql import SparkSession\n",
"\n",
"spark = SparkSession.builder.config(\n",
" \"spark.jars.packages\", \"org.diana-hep:histogrammar-sparksql_2.11:1.0.4\"\n",
").getOrCreate()"
" pyspark_installed = True\n",
"except ImportError:\n",
" print(\"pyspark needs to be installed for this example\")\n",
" pyspark_installed = False"
]
},
{
@@ -158,18 +160,19 @@
"metadata": {},
"outputs": [],
"source": [
"sdf = spark.createDataFrame(df)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sdf.pm_stability_report(\n",
" time_axis=\"DATE\", time_width=\"1w\", time_offset=\"2015-07-02\", extended_report=False\n",
")"
"if pyspark_installed:\n",
" spark = SparkSession.builder.config(\n",
" \"spark.jars.packages\", \"org.diana-hep:histogrammar-sparksql_2.11:1.0.4\"\n",
" ).getOrCreate()\n",
"\n",
" sdf = spark.createDataFrame(df)\n",
"\n",
" sdf.pm_stability_report(\n",
" time_axis=\"DATE\",\n",
" time_width=\"1w\",\n",
" time_offset=\"2015-07-02\",\n",
" extended_report=False,\n",
" )"
]
},
{
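Note on these two hunks: the notebook now wraps the pyspark import in `try`/`except ImportError` and gates the Spark cell on a flag. An alternative way to compute such a flag without importing the package is `importlib.util.find_spec`. This is a sketch of that alternative, not what the notebook does, and `is_installed` is a hypothetical helper name:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None for a missing top-level module.
    return importlib.util.find_spec(module_name) is not None


pyspark_installed = is_installed("pyspark")

if pyspark_installed:
    print("pyspark found: the Spark example can run")
else:
    print("pyspark needs to be installed for this example")
```

Only top-level names are passed here: for dotted names `find_spec` imports the parent package first and can raise if that parent is missing.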
@@ -287,7 +290,7 @@
"outputs": [],
"source": [
"split_hist = split_hists.query(\"date == '2015-07-05 12:00:00'\")\n",
"split_hist.histogram[0].hist.plot.matplotlib();"
"split_hist.histogram[0].hist.plot.matplotlib()"
]
},
{
@@ -303,7 +306,7 @@
"metadata": {},
"outputs": [],
"source": [
"split_hist.histogram_ref[0].hist.plot.matplotlib();"
"split_hist.histogram_ref[0].hist.plot.matplotlib()"
]
},
{
@@ -320,11 +323,14 @@
"metadata": {},
"outputs": [],
"source": [
"import pickle\n",
"# As HTML report\n",
"report.to_file(\"report.html\")\n",
"\n",
"with open(\"report.pkl\", \"wb\") as f:\n",
" pickle.dump(report, f)\n",
"report.to_file(\"report.html\")"
"# Alternatively, as serialized Python object\n",
"# import pickle\n",
"\n",
"# with open(\"report.pkl\", \"wb\") as f:\n",
"# pickle.dump(report, f)"
]
},
{
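Note on this hunk: the pickle path is now commented out and offered as an alternative to the HTML export. A minimal round-trip sketch of that alternative, assuming a plain dictionary as a stand-in for the report object and a temporary directory in place of the notebook's working directory:

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for the report object; this is an assumption for illustration only.
report_like = {"name": "stability_report", "features": ["age", "income"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = Path(tmp_dir) / "report.pkl"

    # Serialize, as in the commented-out notebook lines.
    with open(pkl_path, "wb") as f:
        pickle.dump(report_like, f)

    # Load it back to confirm the round trip.
    with open(pkl_path, "rb") as f:
        restored = pickle.load(f)

print(restored == report_like)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.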
@@ -473,18 +479,18 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
"version": "3.8.6"
},
"nteract": {
"version": "0.15.0"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"source": [],
"metadata": {
"collapsed": false
}
},
"source": []
}
}
},
4 changes: 2 additions & 2 deletions popmon/notebooks/popmon_tutorial_basic.ipynb
@@ -4,7 +4,6 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
@@ -36,10 +35,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
"!{sys.executable} -m pip install popmon"
"!\"{sys.executable}\" -m pip install popmon"
]
},
{
3 changes: 2 additions & 1 deletion popmon/notebooks/popmon_tutorial_incremental_data.ipynb
@@ -28,10 +28,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
"!{sys.executable} -m pip install popmon"
"!\"{sys.executable}\" -m pip install popmon"
]
},
{
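Note on the install cells in all three notebooks: `%%capture` is an IPython cell magic that captures the cell's output, and quoting `{sys.executable}` presumably protects against interpreter paths that contain spaces (the commit does not state the motivation). Outside a notebook, passing the command as an argument list avoids shell quoting altogether. A sketch under that assumption, running a harmless `-c` command instead of `pip install`:

```python
import subprocess
import sys

# Each list element is passed as one argument, so a space in
# sys.executable needs no quoting. No shell is involved.
result = subprocess.run(
    [sys.executable, "-c", "print('ok')"],
    capture_output=True,  # rough analogue of %%capture for a child process
    text=True,
    check=True,
)

print(result.stdout.strip())
```

Replacing the `-c` arguments with `"-m", "pip", "install", "popmon"` gives the same command the notebook cell runs.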
47 changes: 24 additions & 23 deletions popmon/pipeline/report_pipelines.py
@@ -18,7 +18,7 @@
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


from pathlib import PosixPath
from pathlib import Path

from ..base import Pipeline
from ..config import config
@@ -30,6 +30,7 @@
metrics_self_reference,
)
from ..visualization import (
AlertSectionGenerator,
HistogramSection,
ReportGenerator,
SectionGenerator,
@@ -46,7 +47,7 @@ def self_reference(
features=None,
skip_empty_plots=True,
last_n=0,
plot_hist_n=2,
plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -160,7 +161,7 @@ def rolling_reference(
features=None,
skip_empty_plots=True,
last_n=0,
plot_hist_n=2,
plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -218,7 +219,7 @@ def expanding_reference(
features=None,
skip_empty_plots=True,
last_n=0,
plot_hist_n=2,
plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -284,7 +285,7 @@ def __init__(
last_n=0,
skip_first_n=0,
skip_last_n=0,
plot_hist_n=2,
plot_hist_n=6,
):
"""Initialize an instance of Report.
@@ -329,35 +330,35 @@ def sg_kws(read_key):
# - a section showing all traffic light alerts of monitored statistics
# - a section with a summary of traffic light alerts
# --- o generate report
SectionGenerator(
dynamic_bounds="dynamic_bounds",
section_name=profiles_section,
static_bounds="static_bounds",
ignore_stat_endswith=["_mean", "_std", "_pull"],
**sg_kws("profiles"),
HistogramSection(
read_key="split_hists",
store_key=sections_key,
section_name=histograms_section,
hist_name_starts_with="histogram",
last_n=plot_hist_n,
description=descs.get("histograms", ""),
),
TrafficLightSectionGenerator(
section_name=traffic_lights_section, **sg_kws("traffic_lights")
),
AlertSectionGenerator(section_name=alerts_section, **sg_kws("alerts")),
SectionGenerator(
dynamic_bounds="dynamic_bounds_comparisons",
static_bounds="static_bounds_comparisons",
section_name=comparisons_section,
ignore_stat_endswith=["_mean", "_std", "_pull"],
**sg_kws("comparisons"),
),
TrafficLightSectionGenerator(
section_name=traffic_lights_section, **sg_kws("traffic_lights")
),
SectionGenerator(section_name=alerts_section, **sg_kws("alerts")),
HistogramSection(
read_key="split_hists",
store_key=sections_key,
section_name=histograms_section,
hist_name_starts_with="histogram",
last_n=plot_hist_n,
description=descs.get("histograms", ""),
SectionGenerator(
dynamic_bounds="dynamic_bounds",
section_name=profiles_section,
static_bounds="static_bounds",
ignore_stat_endswith=["_mean", "_std", "_pull"],
**sg_kws("profiles"),
),
ReportGenerator(read_key=sections_key, store_key=store_key),
]
if isinstance(report_filepath, (str, PosixPath)) and len(report_filepath) > 0:
if isinstance(report_filepath, (str, Path)) and len(report_filepath) > 0:
self.modules.append(FileWriter(store_key, file_path=report_filepath))

def transform(self, datastore):
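Note on the `report_filepath` check in this hunk: replacing `PosixPath` with `Path` lets Windows paths pass the `isinstance` test, but `pathlib` objects do not define `__len__`, so `len(report_filepath)` raises `TypeError` when a `Path` is passed. A sketch of one possible check that accepts both types; `has_report_path` is a hypothetical helper, not part of popmon:

```python
from pathlib import Path


def has_report_path(report_filepath):
    """Return True for a non-empty str or a pathlib.Path."""
    if not isinstance(report_filepath, (str, Path)):
        return False
    # Convert to str first: len() is defined for str but not for Path.
    return len(str(report_filepath)) > 0


# Demonstrate why len() cannot be applied to a Path directly.
try:
    len(Path("report.html"))
    path_len_fails = False
except TypeError:
    path_len_fails = True

print(path_len_fails)
print(has_report_path(Path("report.html")), has_report_path(""))
```

Note that `Path("")` normalizes to `Path(".")`, so an empty `Path` still counts as non-empty under this check.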
4 changes: 2 additions & 2 deletions popmon/version.py
@@ -1,6 +1,6 @@
"""THIS FILE IS AUTO-GENERATED BY SETUP.PY."""

name = "popmon"
version = "0.3.10"
full_version = "0.3.10"
version = "0.3.11"
full_version = "0.3.11"
release = True
4 changes: 3 additions & 1 deletion popmon/visualization/__init__.py
@@ -20,14 +20,15 @@

# flake8: noqa

from popmon.visualization.alert_section_generator import AlertSectionGenerator
from popmon.visualization.histogram_section import HistogramSection
from popmon.visualization.report_generator import ReportGenerator
from popmon.visualization.section_generator import SectionGenerator
from popmon.visualization.traffic_light_section_generator import (
TrafficLightSectionGenerator,
)

# set matplotlib backend to batchmode when running in shell
# set matplotlib backend to batch mode when running in shell
# need to do this *before* matplotlib.pyplot gets imported
from ..visualization.backend import set_matplotlib_backend

@@ -39,4 +40,5 @@
"HistogramSection",
"ReportGenerator",
"TrafficLightSectionGenerator",
"AlertSectionGenerator",
]
