Namespacing variables #668

Closed
verbman opened this issue May 19, 2018 · 14 comments


@verbman

verbman commented May 19, 2018

Hi there!

Here is what I did:

I helped with a hackathon for OpenFisca in New Zealand

Here is what I expected to happen:

Variables to have namespaces (like parameters) so everything appeared orderly

Here is what actually happened:

Variables did not have namespaces and it was confusing to explain. I appreciate this would break France but New Zealand does not care.

I identify more as a:

  • Developer
@MattiSG
Member

MattiSG commented May 19, 2018

I appreciate this would break France but New Zealand does not care.

I take full responsibility for that encouraged sassiness 😂

To rephrase in a very concrete way: when I put a variable file under directories, why doesn't OpenFisca Core prepend the directory hierarchy to the variable name, as it does for parameters?
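To make the question concrete, here is a minimal sketch of what "prepending the directory hierarchy" could mean for a variable. It is not OpenFisca Core code: the helper `qualified_name` and the `variables` root directory are assumptions made up for illustration.

```python
from pathlib import PurePosixPath


def qualified_name(root: str, file_path: str, variable_name: str) -> str:
    """Build a dotted name from a file's location under `root`.

    Hypothetical helper, not part of OpenFisca Core.
    """
    relative = PurePosixPath(file_path).relative_to(root)
    # Drop the ".py" suffix, then join directories and file stem with dots.
    namespace = ".".join(relative.with_suffix("").parts)
    return f"{namespace}.{variable_name}"


print(qualified_name(
    "variables",
    "variables/prestations/minima_sociaux/rsa.py",
    "rsa",
))
# prints "prestations.minima_sociaux.rsa.rsa"
```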

@benjello
Member

@verbman @MattiSG: It is a feature, not a bug ;-)
Some economist users use OpenFisca as an interactive tool to do computations on test cases or survey data.
They frequently type the name of the variable they want to explore. It would be very painful to type the entire name every time you need a specific variable that is unambiguous.
Parameters are used less frequently, and they have a lot more ambiguities.
But we tried to organise the variables into thematic folders to help the developers.
I won't be against using namespaces, but it should be possible to call a variable by its final name if there are no ambiguities, to keep it as simple as possible for the user.

@MattiSG
Member

MattiSG commented May 21, 2018

Thanks a lot @benjello for your insight!

it should be possible to call a variable by its final name if there are no ambiguities

That would probably not work as expected, as the whole point of namespacing is removing any ambiguities while still allowing shorter “last names”.

The usual way this balance is struck in imports in programming languages is to have some sort of “import”.

Do you believe the population you described (or any other user, really, because I think all would have an annoying time typing fully-qualified names all the time!) would be happy with a solution where you would do an equivalent of importing a namespace? In such a way that you could either reference a variable (or parameter) through its full.qualified.namespace.which.could.be.very_long or import very_long from full.qualified.namespace.which.could.be and then reference only very_long 🙂
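As a rough illustration of that "import a namespace" idea, the sketch below resolves short names relative to a chosen prefix. The function `import_namespace` and the plain-dict registry are hypothetical, invented for this example, and do not exist in OpenFisca.

```python
def import_namespace(registry: dict, prefix: str) -> dict:
    """Return the entries of `registry` under `prefix`, keyed by relative name.

    Hypothetical helper: `registry` maps fully qualified names to variables.
    """
    dotted = prefix + "."
    return {
        full_name[len(dotted):]: variable
        for full_name, variable in registry.items()
        if full_name.startswith(dotted)
    }


# Made-up registry content, for illustration only.
registry = {"full.qualified.namespace.which.could.be.very_long": "some variable"}

namespace = import_namespace(registry, "full.qualified.namespace.which.could.be")
print(namespace["very_long"])  # prints "some variable"
```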

@benjello
Member

@MattiSG: I think it won't work, because for variables there are many more unambiguous short last names than ambiguous ones (it is the opposite for parameters), and the import strategy won't lighten the burden because:

  • it adds a layer of coding complexity
  • you still have the burden to look for the long fully qualified name at least once

Maybe it is possible to make the fully qualified name lovers happy by using
full.qualified.namespace.which.could.be.very_long.__name__?

My preference, based on experience and observation, is to use long last names joined with _ when needed, and to try to formalize that.

@MattiSG
Member

MattiSG commented May 22, 2018

fully qualified name lovers

I guess it is not so much about loving long names as it is about variables and parameters not being treated consistently.

Interesting point that variable name conflicts are much less likely to happen than parameter name conflicts.

So maybe supporting both fully qualified and short names, throwing a warning when a short name is defined, and an error when a short name is used when it is ambiguous?
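One possible reading of this proposal, sketched under stated assumptions: the class `VariableRegistry` and the exception `AmbiguousNameError` are hypothetical names, not OpenFisca APIs, and the warning is emitted at the moment a registration makes a short name ambiguous, which is only one interpretation of "when a short name is defined".

```python
import warnings


class AmbiguousNameError(KeyError):
    """Raised when a short name matches several fully qualified names."""


class VariableRegistry:
    """Hypothetical registry accepting both full and short names."""

    def __init__(self):
        self._by_full_name = {}
        self._full_names_by_short_name = {}

    def register(self, full_name, variable):
        if full_name in self._by_full_name:
            raise ValueError(f"{full_name} is already defined")
        self._by_full_name[full_name] = variable
        short_name = full_name.rsplit(".", 1)[-1]
        matches = self._full_names_by_short_name.setdefault(short_name, [])
        matches.append(full_name)
        if len(matches) > 1:
            # Defining the variable is allowed, but the short name stops working.
            warnings.warn(f"short name {short_name!r} is now ambiguous: {matches}")

    def resolve(self, name):
        # A fully qualified name always wins.
        if name in self._by_full_name:
            return self._by_full_name[name]
        matches = self._full_names_by_short_name.get(name, [])
        if not matches:
            raise KeyError(name)
        if len(matches) > 1:
            raise AmbiguousNameError(f"{name!r} could be any of {matches}")
        return self._by_full_name[matches[0]]
```

With this sketch, a short name such as `rsa` keeps working as long as only one variable ends with it, and the error only appears for users who actually type an ambiguous short name.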

@benjello
Member

@MattiSG: no irony intended, I am a fully qualified name lover (and consistency lover) ;-)

In practice it is very painful, and more so in French, where the words are long (Italians may suffer too).

But I support your solution, which seems to be in line with what I suggested. I do not understand when the warning will be thrown, though?

@MattiSG
Member

MattiSG commented May 23, 2018

no irony intended

No worries, sorry if my message sounded offended, I totally appreciate the qualification! 😛

when the warning will be thrown, though?

Good point. I guess I had in mind it would be when it is added to the legislation, but that cannot be detected… What would make sense is to have a warning when the tax and benefit system is loaded, but then I am not certain it is very wise: either OpenFisca supports conflicting varnames and just tells you when it is ambiguous, or it doesn't and it can shout at you as soon as you define them.
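The second alternative, shouting as soon as conflicting names are defined, could look like the following load-time check. This is a sketch only: `find_short_name_conflicts` and `load_strict` are hypothetical helpers, and the `a.famille` / `b.famille` names are made up for the example.

```python
from collections import defaultdict


def find_short_name_conflicts(full_names):
    """Group fully qualified names by last segment and keep the collisions."""
    by_short_name = defaultdict(list)
    for full_name in full_names:
        by_short_name[full_name.rsplit(".", 1)[-1]].append(full_name)
    return {
        short_name: names
        for short_name, names in by_short_name.items()
        if len(names) > 1
    }


def load_strict(full_names):
    """Refuse to load a system in which any short name is ambiguous."""
    conflicts = find_short_name_conflicts(full_names)
    if conflicts:
        raise ValueError(f"conflicting variable names: {conflicts}")
    return list(full_names)


print(find_short_name_conflicts(["a.famille", "b.famille", "c.rsa"]))
# prints {'famille': ['a.famille', 'b.famille']}
```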

Let's go back to the use case.

@verbman could you maybe provide an example where you would have liked to use a fully qualified namespace, how associated parameters were referenced, and how namespace support would have renamed the variable? 🙂
@fpagnoux @guillett do you have examples of variables you had to give longer names to than would have been necessary if there was namespacing support?

@fpagnoux
Member

fpagnoux commented May 23, 2018

@fpagnoux @guillett do you have examples of variables you had to give longer names to than would have been necessary if there was namespacing support?

Yes: aide_logement_base_ressources_eval_forfaitaire could be prestations.aide_logement.base_ressources.eval_forfaitaire.

There are many similar examples within benefits

There are also cases where namespacing would be a little cleaner:

  • famille could become prelevements_obligatoires.prelevements_sociaux.cotisations_sociales.travail_prive.famille, which is much longer but also much cleaner.
  • f6de could become prelevements_obligatoires.impot_revenu.charges_deductibles.f6de, which is longer but much more explicit.

But on the other side, I prefer:

  • salaire_net over prelevements_obligatoires.prelevements_sociaux.contributions_sociales.activite.salaire_net
  • rsa over prestations.minima_sociaux.rsa.rsa

(Though, writing these last two examples, I'm less annoyed by the namespacing than I thought I would be: the first one is way too long, but such a common variable should probably not be buried that deep. The second one is not that bad.)

@fpagnoux
Member

fpagnoux commented May 23, 2018

That would probably not work as expected, as the whole point of namespacing is removing any ambiguities while still allowing shorter “last names”.

The usual way this balance is struck in imports in programming languages is to have some sort of “import”.

I guess we could have something like:

from openfisca_france.variables.prestations.minima_sociaux import rsa

...

def formula(famille, period, parameters):
    ...
    famille(rsa.rsa, period)

Actually, this would not be so hard to implement.

Neither would be maintaining the three options:

from openfisca_france.variables.prestations.minima_sociaux import rsa

...

def formula(famille, period, parameters):
    famille(rsa.rsa, period)
    famille('rsa', period)  # Should raise an error if several variables are named "rsa"?
    famille('prestations.minima_sociaux.rsa.rsa', period)

But we would lose a lot of consistency if users started mixing all three styles while coding the legislation.

@MattiSG
Member

MattiSG commented May 23, 2018

I guess we could have something like […]

That was my suggestion above, but @benjello raised the following concerns:

  1. it adds a layer of coding complexity
  2. you still have the burden to look for the long fully qualified name at least once

I'm not fully convinced by argument 2, as I believe you have to look for the name of a variable at least once in any case 😉
I'm not sure about argument 1: maybe imports are indeed hard to wrap your head around as an economist. I wonder how much of a bottleneck it would be compared to all the other elements you have to learn in OpenFisca 🤔

I personally like how showing the full namespace would also ease navigating the folder structure.

@benjello by any chance, what do you think of that longer version example? Do you still reckon it would be painful?

@benjello
Member

My use case is not about coding the legislation; it is the everyday use of OpenFisca and openfisca-france derivative products to compute aggregates of taxes, distributions of incomes, etc. The easiest way to do this is to pass a list of variables as strings and use simulation.calculate('my_variable', period = some_period) whenever you need to compute something.
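A minimal sketch of that workflow, assuming only that the simulation object exposes `calculate(variable_name, period)` as quoted above. The helper `calculate_all` and the `StubSimulation` class are hypothetical stand-ins so the example runs without OpenFisca installed, and the period string is made up.

```python
def calculate_all(simulation, variable_names, period):
    """Compute each named variable and return the results keyed by name."""
    return {
        name: simulation.calculate(name, period)
        for name in variable_names
    }


class StubSimulation:
    """Stand-in for a real simulation; it only echoes what was requested."""

    def calculate(self, variable_name, period):
        return f"{variable_name} for {period}"


results = calculate_all(StubSimulation(), ["salaire_net", "rsa"], "2018-05")
print(results["rsa"])  # prints "rsa for 2018-05"
```

The shorter the strings in that list, the less typing an interactive session needs, which is the cost of mandatory fully qualified names for this kind of user.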

So when you use these tools (like this one), build them, or build on top of them the helpers you use every day (hoping that they climb the ladder once stabilized), it is important to have a simple way to simultaneously:

  • pass the variables between functions
  • use them interactively, i.e. shorten the time you spend typing them

But note that I am not opposed to using long names when coding the tax and benefit system, although I am not sure fully qualified names would be convenient at all if ambiguous short names were not allowed.

Here are some responses to the questions raised:

  • @MattiSG: I do not necessarily need to search for the fully qualified name when looking for a variable the first time. I just search for it in the codebase with my editor (it is the most common way around me) and fall back on walking down the hierarchy if that doesn't succeed.
  • @fpagnoux: f6de is not a good example; it should be renamed. famille is definitely a good one ;-)

I hope this clarifies my position.

@MattiSG
Member

MattiSG commented May 28, 2018

Note that supporting variables with same names is the current behaviour and we would like to change it: #562

@fpagnoux
Member

I think #562 was more saying that a variable and an entity should not be allowed to have the same name.

Right now, it is already impossible to have two variables with the same name. An error would be raised when the tax and benefit system is loaded.

@Morendil
Contributor

Morendil commented Feb 6, 2019

Closing as stale. We might revisit this after we have considered how variables more generally are handled (see e.g. comments in #810), since namespacing is a second-order feature of managing named entities such as variables, functions or OpenFisca's legislative parameters.
