Yaml parameters #387
Very good job @laem ! A few comments though:
Looking forward to seeing the implementation in openfisca-france ;-)
At the moment, formulas are passed along with the parameter call in the formula. This idea is that only formulas can be used in conditions, and conditions could directly trigger
In my opinion trees can be avoided by using separate parameter files, VAR nodes, tags and a web view (coming next).
It's currently implemented as a totally separated process in the new file parameters.py
I haven't yet looked at the YAML files generated from the XLS files, but merging them at this level could be far easier.
This test file implements an example of using parameters in formulas on real -france use cases.
About your call to the parameter plafond_securite_sociale, please note that a legislative parameter is not always linked to a tax-benefit system variable.
Do you mean that parameters are not necessarily tied to a unique variable? In that case, no worries, you can of course get whatever parameter you want by name. But in my experience they are in most cases, so this shortcut helps reduce useless code.
2015-01-01: 3170
# Asking for a parameter for the date before the last will
# raise a ParameterNotFound error
# TODO: Or should it return 0 ?
Or return the next known value ?
Yes of course it's a possibility
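To make the three options in this thread concrete, here is a minimal sketch of a dated lookup over values stored in descending order. Only the `ParameterNotFound` name and the 2015 figure come from the diff; `get_value`, its `fallback` argument and the 2014 figure are hypothetical and only serve the illustration.

```python
from datetime import date


class ParameterNotFound(Exception):
    """Raised when no value is known at the requested date."""


# Values in descending date order, most recent first.
# Only the 2015 figure appears in the diff; the 2014 one is illustrative.
VALUES = [
    (date(2015, 1, 1), 3170),
    (date(2014, 1, 1), 3129),
]


def get_value(values, instant, fallback="raise"):
    """Return the value in force at `instant` (hypothetical helper)."""
    for start, value in values:  # descending order: the first match wins
        if start <= instant:
            return value
    # `instant` is earlier than the oldest known start date.
    if fallback == "zero":
        return 0
    if fallback == "next_known":
        return values[-1][1]  # oldest known value
    raise ParameterNotFound("no value known at {}".format(instant))
```

Calling `get_value(VALUES, date(2013, 6, 1))` raises by default, while `fallback="zero"` and `fallback="next_known"` correspond to the two alternatives raised in the comments above.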
# Values should be written in descending ordered :
# what's most important today first
VALUES:
# Fuzzy is explicit :
👍
This looks very promising, thanks @laem for this work! I'm gathering my thoughts and I'll come back ;).
description: SMIC horaire brut
format: float
type: monetary
VALUES:
What's with the casing?
Just me transcribing caps locked XML... and I was thinking that the transformation type (here just VALUES, there BAREME and LIN-whatever) helps understand what the parameter is about quicker.
There are definitely a lot of good things in this. Here are my thoughts and questions:
To take a simple example:

    class agffc(Variable):
        label = u"Cotisation retraite AGFF"

        def formula(self, simulation, period):
            base = simulation.calculate('salaire', period)
            pmss = simulation.getParameter('pmss', instant)

Instead of:

            return period, self.getParameter(
                instant,
                base_options={'base': base, 'factor': pmss}
            )

I'd prefer to have something like:

            return period, MarginalRateBaremeTemplate(
                bareme=simulation.getParameter('agffc'),  # or self.getParameter() if you want to have it transparent.
                instant=period.start,
                base_options={'base': base, 'factor': pmss}
            )

The specific behaviour would be in its own place, more discoverable and explicit. And it would be easier to create new templates (no need to modify a core function). The parameters would be roughly the same as in what you suggest, but maybe with fewer details (it would just define a
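As a rough illustration of the "template in its own place" idea, the sketch below puts the marginal-rate computation in a small class instead of a core function. Everything here is an assumption: the class name, its `compute` method, the bracket layout (thresholds as multiples of a factor such as the pmss) and the rates are invented for the example and are not the PR's API or the real AGFF rates. Plain floats are used for brevity where a real formula would work on arrays.

```python
class MarginalRateScaleTemplate(object):
    """Hypothetical template: holds a scale and applies it on demand."""

    def __init__(self, brackets):
        # (threshold, rate) pairs sorted by threshold; thresholds are
        # expressed in multiples of the `factor` given to compute().
        self.brackets = brackets

    def compute(self, base, factor):
        amount = 0.0
        for i, (threshold, rate) in enumerate(self.brackets):
            lower = threshold * factor
            # A bracket runs up to the next threshold, or has no upper
            # bound when it is the last one.
            if i + 1 < len(self.brackets):
                upper = self.brackets[i + 1][0] * factor
            else:
                upper = float("inf")
            if base > lower:
                amount += (min(base, upper) - lower) * rate
        return amount


# Illustrative scale: 1% up to 1 x factor, 2% above.
scale = MarginalRateScaleTemplate([(0, 0.01), (1, 0.02)])
contribution = scale.compute(base=4000.0, factor=3000.0)
```

A new kind of scale would then be a new class next to this one, which is the point made above about not having to modify a core function.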
Yes! We have two principles for writing formulas: ad-hoc code and reusable code. b) Python classes are no longer used for reusable computations that can reasonably be expressed in YAML (or any other declarative format), which will call the necessary simulation.calculate behind the scenes. My examples here are halfway between a) and b), but I'm convinced we should explore the latter. To me, the dramatic drawback of approach a) is that your formula parameters are always only half of the story: "Here are some numbers related to
The risk I see with declarative formulas, kept entirely out of the Python code, is that they will work for some easy cases (like cotisations), but clearly we won't be able to use them for more complicated formulas. So we'll end up with two codebases: one in YAML, another in Python. I'm sure we'll still be able to run products like mes-aides or embauche, each confined to its own domain. But developers will find it hard to understand these two different worlds. They will probably work in only one of them and not be able to do much in the other. And thus we would be giving up the idea of a single product that makes the French tax and benefit system readable and executable.
Two files : - parameters.yaml - variables_parameters.yaml
A value can be retrieved. But it should be a vector and it should be easier
Name can be omitted if = variable name. Ignore collections for now. Adapt direct parameters.get tests.
And test it.
Linear is an adjective whereas bareme is a noun, but linear_transform isn't any better...
Turn them to English. Use the precise marginalRateTaxScale name, the others are not implemented for now
The rate is variable along the population dimension
Is there any news on this PR? I'm pretty impatient about the YAML parameters ;) If this is a big piece of work, is there a way we could split it and merge some of it, in a lean approach, to start benefiting from the enhancement?
I'm investigating more expressive YAML syntax for writing variables and a UI to explore them in a different repo. What's missing ?
No changes for over 8 months, the description still says “this PR is not ready (review the code at your own risk 😁 ), but exists to expose and discuss the new functionality”, and CI doesn't pass. This seems very monolithic and, even though there was enthusiasm for the switch to YAML, it looks like no actionable consensus was reached. To keep the current active work in progress readable, I am closing this PR since there is no active development on it.
The problem and possible solutions were exposed here. In short : how to throw preprocessing.py away and make the prélèvements sociaux parameters more readable.
Following is a list of proposed features.
YAML parameters
No hierarchy
You can no longer write hierarchical parameters. Hierarchy is heavily used in the prélèvements sociaux domain in openfisca-france, leading to an awful preprocessing.py that hinders both programmatic and non-initiated human intervention.
The new VAR node type can replace hierarchy in some cases.
Conditions in parameters (VAR)
Lots of formulas are just operating a switch on some trivial condition towards alternative parameters, e.g. :
This is confusing, it introduces linking-names that are just rules stringified, and prevents contributors from modifying parameters with certainty without diving into python-numpy code.
The proposed syntax is :
Condition nodes are just an eval of numpy arrays, and should probably be constrained to a set of possibilities (using regexps ?) to inform the parameter contributor about what's allowed. Nevertheless, I favor good documentation over raising errors to inform.
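One way to constrain conditions with a regexp, sketched under assumptions: the grammar (`<variable> <operator> <number>`), the `evaluate_condition` helper and the variable name in the usage note are all hypothetical and are not taken from parameters.py. The sketch parses the condition rather than eval-ing it, and uses plain lists where the real code would compare numpy arrays.

```python
import operator
import re

# Hypothetical grammar: "<variable> <operator> <number>" and nothing else.
CONDITION = re.compile(
    r"^\s*([a-z_][a-z0-9_]*)\s*(<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$"
)

OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def evaluate_condition(condition, variables):
    """Evaluate a whitelisted condition element-wise over `variables`."""
    match = CONDITION.match(condition)
    if match is None:
        # Anything outside the grammar is rejected before evaluation.
        raise ValueError("unsupported condition: {!r}".format(condition))
    name, symbol, number = match.groups()
    compare = OPERATORS[symbol]
    threshold = float(number)
    return [compare(value, threshold) for value in variables[name]]
```

For instance, `evaluate_condition("effectif_entreprise < 20", {"effectif_entreprise": [5, 20, 50]})` yields one boolean per individual, and a string outside the grammar fails with a `ValueError` whose message can double as documentation of what is allowed.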
See variable_parameters.yaml for more examples.
New parameter models
The prélèvements sociaux parameters make heavy use of BAREME parameters, even for simpler "cotisations" that only have one tranche.
Hence the new LIN parameter model, implementing a rate*base transformation, possibly with a threshold (it really is a shortcut for a BAREME with a single tranche).
Short date notation
XML
YAML
Short YAML
Extensions
YAML parameters (especially without hierarchy) should make it easier to use extensions (no NODE.children pollution). @cbenz can you confirm ?
Next steps
The verification of parameters in the parsing step is also way less developed than currently.
And all the TODOs in the code.
Questions
LIN, and ad-hoc code. It looks to me like an inevitable and virtuous path, for it could enable non-coders to read and update these simpler formulas.
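To show how small such a declarative model can be, here is a sketch of the LIN model from "New parameter models" next to a single-tranche BAREME. The function names and signatures are hypothetical, and the threshold semantics (the rate applies only to the part of the base above the threshold) is an assumption chosen so that LIN matches a BAREME whose only tranche starts at that threshold.

```python
def lin(base, rate, threshold=0.0):
    """rate * base, applied to the part of `base` above `threshold`."""
    return max(base - threshold, 0.0) * rate


def bareme(base, tranches):
    """Marginal-rate scale; `tranches` is a sorted list of (threshold, rate)."""
    total = 0.0
    for i, (threshold, rate) in enumerate(tranches):
        # A tranche ends where the next one starts; the last one is open-ended.
        upper = tranches[i + 1][0] if i + 1 < len(tranches) else float("inf")
        total += max(min(base, upper) - threshold, 0.0) * rate
    return total


# Illustrative figures, not actual legislation.
with_lin = lin(2000.0, 0.05, threshold=500.0)
with_bareme = bareme(2000.0, [(500.0, 0.05)])
```

Under these assumptions both calls give the same amount, which is what "a shortcut for a BAREME with a single tranche" amounts to.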