Releases: jiro-iwanaga/rfscorer
[0.4.5] - 2026-06-17
Added
- Spearman correlation diagnostic attributes, populated automatically by fit():
  - recency_corr_ / frequency_corr_: equal-weight Spearman correlation between the RF rank
    and the empirical probability. Negative recency / positive frequency correlation indicates
    the expected monotonic relationship.
  - recency_corr_weighted_ / frequency_corr_weighted_: sample-size weighted variants.
  - recency_corr_pvalue_ / frequency_corr_pvalue_: two-sided p-values for the equal-weight
    correlations.
  - recency_slice_corr_ / frequency_slice_corr_: dicts of per-slice correlations for
    diagnosing 2D monotonicity (e.g. recency-vs-probability within each frequency level).
- verbose parameter to optimize() (default False): when True, prints solver progress and
  result summary.
- path parameter to plot_probability_surface() and plot_marginal_probability(): save the
  figure directly. A directory path writes a default filename
  (surface_{kind}_probability.png / marginal_{kind}_probability.png); a file path writes that
  name. Both methods now also show a default title based on kind when title=None.
- Practical tutorial notebooks (examples/tutorial_practical_ja.ipynb /
  tutorial_practical_en.ipynb): user-level train/test split, building all nine models,
  accuracy comparison, and model save/load (pickle and zip archive).
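
To make the equal-weight and sample-size weighted correlation diagnostics above more concrete, here is a minimal plain-Python sketch. It is not the library's implementation: the helper names, the sample data, and the choice to define the weighted variant as a weighted Pearson correlation on ranks are all assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / sqrt(vx * vy)


def spearman(x, y, weights=None):
    """Spearman correlation; equal weights unless weights are given."""
    if weights is None:
        weights = [1.0] * len(x)
    return weighted_pearson(average_ranks(x), average_ranks(y), weights)


# Illustrative data (not from the library): recency rank, empirical
# probability, and the number of samples behind each probability.
recency = [1, 2, 3, 4, 5]
probability = [0.30, 0.22, 0.15, 0.16, 0.05]
sample_sizes = [500, 300, 120, 40, 10]

equal_weight_corr = spearman(recency, probability)
weighted_corr = spearman(recency, probability, sample_sizes)
```

With this data the equal-weight correlation is -0.9, negative as expected for recency. The weighted variant gives less influence to the sparsely observed ranks, where the single monotonicity violation occurs.
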
Changed
- show() redesigned as a structured diagnostic report (data statistics, fit parameters,
  Spearman correlations, and the empirical probability table) instead of the previous
  profiling-style output.
- plot_marginal_probability() API redesign. Breaking changes:
  - kind values changed from ("emp", "er", "ef", "mr", "mf", "all") to
    ("er", "ef", "mr", "mf", "rboth", "fboth"). The "all" overlay is split into "rboth"
    (empirical + monotonic recency) and "fboth" (empirical + monotonic frequency); "emp" is
    removed.
  - The axis parameter is removed (the axis is now inferred from kind).
  - The separate recency_label / frequency_label parameters are consolidated into a single
    axis_label=None.
- export_probability_csv(): the default output filename changed from {kind}_probability.csv
  to probability_{kind}.csv (e.g. probability_emp.csv). Breaking only for callers relying
  on the auto-generated name when passing path=None or a directory; explicit file paths are
  unaffected. (The CSV names inside the save_zip() archive are unchanged.)
- Internal aggregation dicts renamed to private:
  R/F/RF2N/RF2CV/RF2Prob/R2N/R2CV/R2Prob/F2N/F2CV/F2Prob → _R/_F/_RF2N/…. These were
  implementation details, not public API; the public probability attributes
  (emp_probability_dict_, er_probability_dict_, etc.) are unchanged.
- Documentation overhaul across docs/: document titles renamed (アーキテクチャ構成書 [Architecture
  Document] / 機能仕様書 [Functional Specification] / リポジトリ構成 [Repository Structure] /
  用語集 [Glossary]), glossary.md restructured (基本概念 [basic concepts], 期間とデータ分割
  [periods and data splitting], アルゴリズム [algorithms], API 簡潔版 [concise API reference])
  with terminology unification (推薦スコア=商品選択確率 [recommendation score = product-choice
  probability], 対象イベント [target event], behavior history), and a release procedure added
  to development-guidelines.md.
- Beginner tutorial notebooks (tutorial_beginner_ja.ipynb / tutorial_beginner_en.ipynb)
  updated for the revised API and terminology.
- Test suite expanded (+30 cases; 439 passing) covering transform with 2D optimized kinds,
  plot path saving, objective-function fit quality (analytic optima), datetime64 splits, and
  version-mismatch semantics for load() / load_zip().
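
Since the axis parameter of plot_marginal_probability() is gone and the axis is inferred from kind, the sketch below shows one way such an inference could work. The grouping follows the kind names listed above; the function name, the returned strings, and the error message are hypothetical and not taken from the library.

```python
# Hypothetical helper: maps each 0.4.5 marginal-plot kind to its axis.
RECENCY_KINDS = {"er", "mr", "rboth"}     # empirical / monotonic / both, recency
FREQUENCY_KINDS = {"ef", "mf", "fboth"}   # empirical / monotonic / both, frequency


def infer_axis(kind: str) -> str:
    """Return "recency" or "frequency" for a marginal-plot kind."""
    if kind in RECENCY_KINDS:
        return "recency"
    if kind in FREQUENCY_KINDS:
        return "frequency"
    # "emp" and "all" are no longer valid kinds as of 0.4.5.
    raise ValueError(f"unsupported kind for a marginal plot: {kind!r}")
```

Callers that previously passed kind="all" together with an axis would now choose "rboth" or "fboth" directly.
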
Removed
- recency_probability_ and frequency_probability_ attributes; consolidated into
  er_probability_ / ef_probability_.
- axis parameter, and the "emp" / "all" kind values, from plot_marginal_probability().
Fixed
- Dependency floor: cvxpy>=1.3 → cvxpy>=1.5. The optimizer explicitly solves with the
  CLARABEL solver, which is bundled with cvxpy since 1.4 and the default since 1.5; the previous
  >=1.3 floor could resolve an environment without CLARABEL available.
[0.4.4] - 2026-06-15
Added
- save(path=None) / load(path): persist a fitted model to a pickle file and restore it
  without retraining. path=None saves rfscorer.pkl to the current directory; a directory
  path saves rfscorer.pkl inside it; a file path saves directly. On major or minor version
  mismatch, load() emits a UserWarning and continues loading.
- save_zip(path=None) / load_zip(path): save/restore the model as a zip archive bundling
  rfscorer.pkl, metadata.json (version, parameters, fit statistics), probability-table CSVs,
  and plot PNGs for all computed model kinds. path=None saves scorer.zip to the current
  directory. Intended for research sharing and artifact management.
- Tutorial notebooks (tutorial_beginner_en.ipynb / tutorial_beginner_ja.ipynb): added
  Section 10 covering save() / load() usage with a Google Colab persistence guide.
  Also added a commented # !pip install rfscorer line to the import cell.
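
The path rules and the version-mismatch behavior described for save() / load() can be sketched as follows. This is an illustration only: both helper names are hypothetical, and the rule used to tell a directory from a file (an existing directory, or a path without a suffix) is an assumption that the library may not share.

```python
import warnings
from pathlib import Path

DEFAULT_NAME = "rfscorer.pkl"  # default filename stated in the changelog


def resolve_save_path(path=None, default_name=DEFAULT_NAME):
    """Return the file a save call would write to.

    Assumption: a path is treated as a directory if it already exists
    as one or has no file suffix.
    """
    if path is None:
        return Path.cwd() / default_name
    p = Path(path)
    if p.is_dir() or p.suffix == "":
        return p / default_name
    return p


def check_version(saved: str, current: str) -> bool:
    """Warn, but do not fail, when major or minor versions differ."""
    if saved.split(".")[:2] != current.split(".")[:2]:
        warnings.warn(
            f"model saved with version {saved}, loading with {current}",
            UserWarning,
        )
        return False
    return True
```

A patch-level difference such as 0.4.4 versus 0.4.5 passes silently in this sketch, while 0.3.2 versus 0.4.5 triggers the warning and loading would continue.
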
Changed
- Terminology unification: renamed all eval-prefixed names to gt (ground truth)
  to align with the unified terminology in docs/glossary.md (正解データ / ground truth data).
  Breaking changes:
  - fit(df_obs, df_eval, ...) → fit(df_obs, df_gt, ...)
  - evaluate(df_rec, df_eval, ...) → evaluate(df_rec, df_gt, ...)
  - split_by_date(..., evaluation_days=7, ...) → split_by_date(..., gt_days=7, ...);
    return value documented as (df_obs, df_gt) instead of (df_obs, df_eval).
  - Attribute record_num_eval → record_num_gt
  - Attribute evaluation_start_ → gt_start_
  - Attribute evaluation_end_ → gt_end_
  - show() output label evaluation: → ground_truth:
  - Error message "No events observed in evaluation period" → "No events observed in ground truth period"
- fit(): the datetime column in df_gt is now optional. Only user and item columns
  are required for fitting. The gt_start_ and gt_end_ attributes (which depended on
  df_gt's datetime column) have been removed.
[0.4.3] - 2026-06-15
- Tutorial notebooks: created bilingual beginner tutorials.
  - examples/tutorial_beginner_en.ipynb: English version translated from the Japanese tutorial
    using terminology from docs/glossary.md (interaction log, observation log, ground truth log, etc.).
  - Covers the complete workflow: data loading, splitting, model building (emp/mono/mcc),
    probability visualization, scoring, and evaluation.
  - Updated README.md example references to point to the new tutorial notebooks
    (examples/tutorial_beginner_en.ipynb for the English section,
    examples/tutorial_beginner_ja.ipynb for the Japanese section).
[0.4.2] - 2026-06-14
Changed
- Documentation: comprehensive README.md improvements, including:
  - Added a Citation section with in-text academic citation templates and BibTeX references
    for citing the package in research papers.
  - Added a Minimal Example demonstrating the end-to-end workflow with split_by_date(),
    fit(), optimize(), visualize(), transform(), and evaluate() methods.
  - Improved the Visualization section with a side-by-side comparison of three representative
    optimization methods (Empirical, Monotone, Monotonicity-Convex-Concave) using a horizontal layout.
  - Resized visualization images for optimal display in documentation.
  - Simplified the English throughout Features, Usage, and method descriptions for clarity.
  - Created a comprehensive Japanese README (# RFScorer (日本語README)) with a complete
    translation of all explanatory text while preserving code examples and diagrams exactly.
  - Aligned the English and Japanese versions to ensure consistent technical terminology
    (product-choice probabilities, optimization methods, feature descriptions).
[0.4.1] - 2026-06-13
Fixed
- split_by_date(): observation_days=N now produces an N-unit observation window
  [target_date - N + 1, target_date], restoring symmetry with evaluation_days=N
  (which produces the N-unit window [target_date + 1, target_date + N]).
  Previously observation_days=N produced an (N+1)-unit window
  [target_date - N, target_date] due to an off-by-one in the inclusive start
  boundary. Migration: if you previously called
  split_by_date(df, target_date, observation_days=N) and want the same
  observation window, pass observation_days=N+1.
- normalize_ref(): invalid string dates (e.g., "not a date") now consistently
  raise ValueError("time value could not be normalized: ..."). Previously the
  str-path bypassed the friendly error and surfaced a raw pandas error.
- Documentation: numerous accuracy fixes across docs/ (glossary, functional-design,
  product-requirements, architecture, repository-structure, development-guidelines)
  including terminology unification (閲覧 [browsing], 累積対象イベント発生数 [cumulative
  target-event count]), _plotting.py reference in the module / repository layout,
  Python 3.11 minimum requirement, and kind enum corrections for
  plot_probability_surface() / plot_marginal_probability().
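
The window arithmetic in the split_by_date() fix can be checked with a small standalone sketch. It assumes a unit of one day and uses a hypothetical helper, split_windows(), that only computes the inclusive boundaries; it does not call the library or split any data.

```python
from datetime import date, timedelta


def split_windows(target_date, observation_days, evaluation_days):
    """Return inclusive (start, end) pairs for both windows, in days."""
    # Fixed behavior: N observation days end on target_date itself.
    obs_start = target_date - timedelta(days=observation_days - 1)
    # The evaluation window starts the day after target_date.
    eval_start = target_date + timedelta(days=1)
    eval_end = target_date + timedelta(days=evaluation_days)
    return (obs_start, target_date), (eval_start, eval_end)


obs, ev = split_windows(date(2026, 6, 28), observation_days=28, evaluation_days=7)
# obs covers 2026-06-01 .. 2026-06-28 (28 days)
# ev  covers 2026-06-29 .. 2026-07-05 (7 days)

# Migration: the pre-fix window for N=28 started one day earlier, which
# corresponds to observation_days=29 under the fixed arithmetic.
legacy_obs, _ = split_windows(date(2026, 6, 28), observation_days=29, evaluation_days=7)
```
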
Changed
- Internal refactor: extracted plot_probability_surface() and
  plot_marginal_probability() to a new private module src/rfscorer/_plotting.py
  as PlottingMixin. RecencyFrequencyScorer now inherits from PlottingMixin,
  so the public API (scorer.plot_*()) is preserved with no caller-visible change.
- Internal refactor: reorganized RecencyFrequencyScorer methods by typical
  workflow (Initialization → Fitting → Optimization → Inference → Evaluation →
  Export → Inspection → Internal helpers) and added section divider comments.
  No behavior change.
[0.4.0] - 2026-06-13
Added
- split_by_date(df, target_date, observation_days=28, evaluation_days=7, time_col="datetime"):
  new top-level utility function (from rfscorer import split_by_date) that splits a single
  interaction log into an observation/evaluation pair at target_date.
  Returns (df_obs, df_eval). Accepts the same datetime or integer time_col as the scorer.
- unit parameter to RecencyFrequencyScorer.__init__(): controls recency bin granularity.
  unit=7 gives weekly recency, unit=30 approximate monthly. Default unit=1 preserves
  the previous day-level behavior.
- Integer time_col support: time_col columns of integer dtype are now accepted in addition
  to datetime / string columns across fit(), transform(), and split_by_date().
- plot_marginal_probability(kind="er") and kind="ef": new 1-D marginal plot support for the
  empirical recency and frequency models.
Changed
- er_probability_ and ef_probability_ are now true 1-D outputs, mirroring the earlier
  mr/mf refactor. Breaking changes:
  - er_probability_: columns reduced to (recency, probability)
    (previously recency, frequency, probability after 2-D broadcast)
  - ef_probability_: columns reduced to (frequency, probability)
    (previously recency, frequency, probability after 2-D broadcast)
  - er_probability_dict_: keys changed from (r, f) tuple to int r
  - ef_probability_dict_: keys changed from (r, f) tuple to int f
  - predict(kind="er"): f argument is now ignored; r is clamped to recency_limit
  - predict(kind="ef"): r argument is now ignored; f is clamped to frequency_limit
  - plot_probability_surface(kind="er"|"ef"): now raises ValueError
    (use plot_marginal_probability() instead)
- empirical_probability_* attributes renamed to emp_probability_* for consistency with all
  other short-form kind prefixes. Breaking changes:
  - empirical_probability_ → emp_probability_
  - empirical_probability_table_ → emp_probability_table_
  - empirical_probability_dict_ → emp_probability_dict_
  - CSV column "empirical_probability" (from export_probability_csv(kind="all")) → "emp_probability"
  - The kind aliases "empirical", "empirical_recency", "empirical_frequency" are preserved.
Removed
- Python 3.10 support. Minimum supported version is now Python 3.11.
- er_probability_table_ and ef_probability_table_ attributes.
  These were 2-D broadcast grids produced by the previous implementation and are no longer generated.
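
The 1-D lookup semantics introduced in this release for kind="er" (f ignored, r clamped to recency_limit) can be illustrated as below. The function predict_er() and its signature are hypothetical; the changelog does not show the real predict() signature, so treat this purely as a sketch of the described behavior.

```python
def predict_er(er_probability_dict, r, f=None, *, recency_limit):
    """Look up a 1-D recency probability keyed by int r.

    f is accepted for call compatibility but ignored; r values beyond
    recency_limit fall back to the entry at recency_limit.
    """
    r_clamped = min(r, recency_limit)
    return er_probability_dict[r_clamped]


# Illustrative table keyed by recency rank (values are made up).
table = {1: 0.30, 2: 0.22, 3: 0.15}

far_past = predict_er(table, 10, f=4, recency_limit=3)   # uses r=3
same_r = predict_er(table, 2, f=99, recency_limit=3)     # f has no effect
```
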
[0.3.2] - 2026-06-11
Changed
- optimize(kind='mr') and optimize(kind='mf') no longer broadcast results to the full RF grid.
  Results are now stored as true 1-D outputs:
  - mr_probability_: DataFrame with columns recency, probability
    (previously recency, frequency, probability after broadcast)
  - mf_probability_: DataFrame with columns frequency, probability
    (previously recency, frequency, probability after broadcast)
  - mr_probability_dict_: keyed by recency rank r (int)
    (previously keyed by (r, f) tuple)
  - mf_probability_dict_: keyed by frequency f (int)
    (previously keyed by (r, f) tuple)
- plot_probability_surface() now raises ValueError when kind='mr' or kind='mf' is specified,
  as 1-D models cannot be represented as a surface plot.
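
For code that stored or cached the old (r, f)-keyed dictionaries, a conversion along the following lines may help when moving to the 1-D layout. This is a migration sketch under stated assumptions: collapse_to_recency() is a hypothetical helper, and it assumes the old dict was a pure broadcast, so every f shares the same value for a given r.

```python
def collapse_to_recency(broadcast):
    """Collapse an (r, f)-keyed broadcast dict into an r-keyed dict.

    Raises ValueError if values for one r differ across f, which would
    mean the input was not a broadcast of a 1-D series.
    """
    collapsed = {}
    for (r, _f), p in broadcast.items():
        if r in collapsed and collapsed[r] != p:
            raise ValueError(f"values for r={r} differ across frequency")
        collapsed[r] = p
    return collapsed


# Old-style dict: the 1-D series {1: 0.30, 2: 0.20} repeated for f = 1..3.
old_style = {(r, f): p for r, p in {1: 0.30, 2: 0.20}.items() for f in (1, 2, 3)}
new_style = collapse_to_recency(old_style)
```

The same idea applies to the frequency axis by keeping f and discarding r.
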
Removed
- mr_probability_table_ and mf_probability_table_ attributes.
  These were 2-D broadcast grids produced by the previous implementation and are no longer generated.
[0.3.1] - 2026-06-10
Fixed
- examples/basic_usage.ipynb: corrected transform() call to use a pre-filtered observation
  window (df_test_obs) instead of the full test log, matching the documented API contract.
- README.md: rewrote the Usage section to reflect the current API: fit() now takes
  pre-split df_obs / df_eval DataFrames, and transform() requires a pre-filtered
  observation window. Added plot_probability_surface() commands alongside each surface image.
Added
- Tests for optimize() kind aliases (monotonic, monotonic_recency, etc.) and
  export_probability_csv().
[0.3.0] - 2026-06-07
Added
- eps parameter to optimize() and RFOptimizer.build_model() for strict monotonicity.
  When eps > 0, adjacent recency/frequency probability values are forced to differ by at least
  eps, preventing ties. Default eps=0.0 preserves the existing weak monotonicity behavior.
  Applies to all kind values (mono, mr, mf, mrc, mfc, mcc).
- Automatic upper-bound validation for eps: raises ValueError if eps exceeds
  p_max / (n - 1) (where p_max is the empirical probability maximum and n is the number
  of recency or frequency levels), ensuring the problem remains feasible.
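
The eps upper bound follows from simple counting: n strictly ordered levels need n - 1 gaps of at least eps each, and those gaps must fit within a range of size p_max, so (n - 1) * eps must not exceed p_max. A minimal sketch of such a check is shown below; validate_eps() is a hypothetical name, and skipping the check for a single level is an assumption made here to avoid dividing by zero.

```python
def validate_eps(eps, p_max, n_levels):
    """Raise ValueError if eps exceeds p_max / (n_levels - 1)."""
    if n_levels > 1:  # assumption: with one level there is nothing to separate
        upper = p_max / (n_levels - 1)
        if eps > upper:
            raise ValueError(
                f"eps={eps} exceeds the feasible upper bound {upper:.6g}"
            )
    return eps


# With p_max = 0.3 and 7 levels the bound is 0.3 / 6 = 0.05.
accepted = validate_eps(0.01, p_max=0.3, n_levels=7)
```
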
Changed
- Kind aliases renamed from monotone_* to monotonic_* for consistent mathematical terminology.

  | Old alias | New alias | Canonical |
  | --- | --- | --- |
  | monotone | monotonic | mono |
  | monotone_recency | monotonic_recency | mr |
  | monotone_frequency | monotonic_frequency | mf |
  | monotone_recency_convex | monotonic_recency_convex | mrc |
  | monotone_frequency_concave | monotonic_frequency_concave | mfc |
  | monotone_convex_concave | monotonic_convex_concave | mcc |
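
A dictionary-based normalizer is one straightforward way to implement an alias table like the one above. The sketch below is not the library's _normalize_kind(); the function name normalize_kind() is hypothetical, and rejecting the old monotone_* spellings is an assumption, since the changelog only says the aliases were renamed.

```python
CANONICAL_KINDS = ("emp", "er", "ef", "mono", "mr", "mf", "mrc", "mfc", "mcc")

# Long aliases as of 0.3.0 (monotonic_* spellings plus the empirical aliases).
ALIASES = {
    "empirical": "emp",
    "empirical_recency": "er",
    "empirical_frequency": "ef",
    "monotonic": "mono",
    "monotonic_recency": "mr",
    "monotonic_frequency": "mf",
    "monotonic_recency_convex": "mrc",
    "monotonic_frequency_concave": "mfc",
    "monotonic_convex_concave": "mcc",
}


def normalize_kind(kind: str) -> str:
    """Return the canonical short form for a kind or one of its aliases."""
    if kind in CANONICAL_KINDS:
        return kind
    try:
        return ALIASES[kind]
    except KeyError:
        raise ValueError(f"unknown kind: {kind!r}") from None
```

Normalizing once at the API boundary keeps the rest of the code working only with the nine canonical short forms.
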
[0.2.8] - 2026-06-07
Added
- optimize(kind='mr'): new 1-D optimization model for the recency axis.
  Enforces monotone decreasing + convex constraints on the marginal recency probability R2Prob,
  then broadcasts the result across all frequency values.
- optimize(kind='mf'): new 1-D optimization model for the frequency axis.
  Enforces monotone increasing + concave constraints on the marginal frequency probability F2Prob,
  then broadcasts the result across all recency values.
- er model: empirical recency marginal probability (R2Prob) broadcast to the full RF grid.
  Computed automatically inside fit() / fit_period(); no extra call needed.
- ef model: empirical frequency marginal probability (F2Prob) broadcast to the full RF grid.
  Computed automatically inside fit() / fit_period(); no extra call needed.
- Corresponding attributes populated by optimize(kind='mr'):
  mr_probability_, mr_probability_table_, mr_probability_dict_
- Corresponding attributes populated by optimize(kind='mf'):
  mf_probability_, mf_probability_table_, mf_probability_dict_
- Corresponding attributes populated by fit() / fit_period():
  er_probability_, er_probability_table_, er_probability_dict_,
  ef_probability_, ef_probability_table_, ef_probability_dict_
- Kind alias system: long descriptive names are accepted everywhere and normalized to their
  canonical short forms via _normalize_kind().

  | Alias | Canonical |
  | --- | --- |
  | empirical | emp |
  | empirical_recency | er |
  | empirical_frequency | ef |
  | monotone | mono |
  | monotone_recency | mr |
  | monotone_frequency | mf |
  | monotone_recency_convex | mrc |
  | monotone_frequency_concave | mfc |
  | monotone_convex_concave | mcc |

- plot_marginal_probability() now accepts a kind parameter ("emp", "mr", "mf", "all").
  kind="all" overlays the empirical and optimized 1-D series on the same axes
  (solid line for emp, dashed line for mr / mf).
Changed
- Internal canonical kind name changed from "empirical" to "emp" for consistency with all other
  short-form kind names (mono, mr, mf, mrc, mfc, mcc).
  The string "empirical" continues to work as an alias.
- plot_marginal_probability(): replaced xlabel parameter with recency_label / frequency_label
  to match the naming convention of plot_probability_surface().
- img/surface_empirical_probability.png renamed to img/surface_emp_probability.png.
- export_probability_csv(kind='all') now outputs all nine models:
  emp, er, ef, mono, mr, mf, mrc, mfc, mcc.