New readme example that doesn't depend on glum_benchmarks (#455)
tbenthompson committed Oct 11, 2021
1 parent b17a15c commit 248d811
Showing 2 changed files with 35 additions and 19 deletions.
52 changes: 34 additions & 18 deletions README.md
@@ -23,40 +23,56 @@ For more information on `glum`, including tutorials and API reference, please se

Why did we choose the name `glum`? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"

# An example: predicting car insurance claim frequency using Poisson regression.
# A classic example predicting housing prices

This example uses a public French car insurance dataset.
```python
>>> import pandas as pd
>>> import numpy as np
>>> from glum_benchmarks.problems import load_data, generate_narrow_insurance_dataset
>>> from glum_benchmarks.util import get_obj_val
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # Load the French Motor Insurance dataset
>>> dat = load_data(generate_narrow_insurance_dataset)
>>> X, y, sample_weight = dat['X'], dat['y'], dat['sample_weight']
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Model the number of claims per year as Poisson and regularize using a L1-penalty.
>>> # Use only select features
>>> X = house_data.data[
... [
... "bedrooms",
... "bathrooms",
... "sqft_living",
... "floors",
... "waterfront",
... "view",
... "condition",
... "grade",
... "yr_built",
... "yr_renovated",
... ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
... family='poisson',
... family='binomial',
... l1_ratio=1.0,
... alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y, sample_weight=sample_weight)
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics shows details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
objective_fct
n_iter
0 0.331670
1 0.328841
2 0.319605
3 0.318660
4 0.318641
5 0.318641
0 0.693091
1 0.489500
2 0.449585
3 0.443681
4 0.443498
5 0.443497

```
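
The value 0.693091 reported at iteration 0 in the listing above is close to ln 2 ≈ 0.693147, which is the mean log loss of a model that predicts a probability of 0.5 for every sample. The sketch below illustrates that relationship in plain Python. It assumes the reported objective is the mean negative log-likelihood plus `alpha` times the L1 norm of the coefficients with an unpenalized intercept; the diff itself does not state this. The function name `l1_logistic_objective` and the toy data are hypothetical and are not part of `glum`.

```python
import math


def l1_logistic_objective(X, y, coef, intercept, alpha):
    """Mean logistic loss plus alpha * ||coef||_1 (intercept not penalized).

    Hypothetical helper for illustration only; it is not glum's implementation.
    """
    total = 0.0
    for row, target in zip(X, y):
        # Linear predictor for one sample.
        eta = intercept + sum(c * x for c, x in zip(coef, row))
        # Numerically stable log(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        # Negative log-likelihood of a 0/1 target under a logit link.
        total += softplus - target * eta
    penalty = alpha * sum(abs(c) for c in coef)
    return total / len(y) + penalty


# Toy data (made up): four samples, two features, binary targets.
X_toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
y_toy = [0, 1, 0, 1]

# With all coefficients and the intercept at zero, every predicted
# probability is 0.5, so the objective equals ln 2 regardless of the data.
start_value = l1_logistic_objective(X_toy, y_toy, [0.0, 0.0], 0.0, alpha=0.001)

# The L1 term adds alpha * (|0.5| + |-0.25|) to the unpenalized loss.
penalized = l1_logistic_objective(X_toy, y_toy, [0.5, -0.25], 0.0, alpha=0.001)
unpenalized = l1_logistic_objective(X_toy, y_toy, [0.5, -0.25], 0.0, alpha=0.0)
```

Here `start_value` matches `math.log(2)` up to floating-point error. One plausible reading of the slightly smaller value in the listing is that the labels are not perfectly balanced and the solver starts from a non-zero intercept, but the diff does not confirm this.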

2 changes: 1 addition & 1 deletion docs/contributing.rst
@@ -17,7 +17,7 @@ Pull request process

- Before working on a non-trivial PR, please first discuss the change you wish to make via issue, Slack, email or any other method with the owners of this repository. This is meant to prevent spending time on a feature that will not be merged.
- Please make sure that a new feature comes with adequate tests. If these require data, please check if any of our existing test data sets fits the bill.
- Please make sure that all functions come with proper docstrings. If you do extensive work on docstrings, please check if the Sphinx documentation renders them correctly. The CI system builds it on every commit and pushes the rendered HTMLs to ``https://docs.dev.***REMOVED***/***REMOVED***/Quantco/glum/{YOUR_COMMIT}/index.html``
- Please make sure that all functions come with proper docstrings. If you do extensive work on docstrings, please check if the Sphinx documentation renders them correctly. ReadTheDocs builds on every commit to an open pull request. You can see whether the documentation has successfully built in the "checks" section of the PR. Once the build finishes, your documentation should be accessible by clicking the "details" link next to the check in the GitHub interface and will appear at a URL like: ``https://glum--###.org.readthedocs.build/en/###/`` where ``###`` is the number of your PR.
- Please make sure you have our pre-commit hooks installed.
- If you fix a bug, please consider first contributing a test that _fails_ because of the bug and then adding the fix as a separate commit, so that the CI system picks it up.
- Please add an entry to the change log and increment the version number according to the type of change. We use semantic versioning. Update the major if you break the public API. Update the minor if you add new functionality. Update the patch if you fixed a bug. All changes that have not been released are collected under the date ``UNRELEASED``.
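
The versioning rule in the last bullet can be sketched as a small helper. This is only an illustration: the function `bump_version` and the change labels `"breaking"`, `"feature"` and `"bugfix"` are hypothetical and do not come from the repository. Resetting the lower components to zero follows the usual semantic versioning convention.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next semantic version for a given kind of change.

    Hypothetical helper; the change labels are illustrative, not project terms.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        # Public API broken: bump major, reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "feature":
        # New functionality: bump minor, reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "bugfix":
        # Bug fix only: bump patch.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, under these assumptions a bug fix on top of `1.4.2` yields `1.4.3`, while a breaking change yields `2.0.0`.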