Remove the category_description field #582

Open
hellais opened this issue Feb 21, 2020 · 3 comments
Labels
discuss, review-needed (Tickets that need a review from Citizenlab people)

Comments

@hellais
Collaborator

hellais commented Feb 21, 2020

The category_description field is redundant, since it is already defined in the category code list. It is also error prone, because contributors need to manually copy and paste it to avoid breakage, and it wastes space in the test list files.
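To illustrate the error-prone part, here is a minimal sketch of a check that flags rows whose copied description no longer matches the category code list. The column names (`url`, `category_code`), the sample codes and descriptions, and the helper `find_mismatches` are assumptions made up for illustration, not the actual layout of the test-lists files.

```python
import csv
import io

# Hypothetical category code list (code -> description); values are illustrative.
CATEGORY_CODES = {"NEWS": "News Media", "HUMR": "Human Rights Issues"}

# Hypothetical test list; the second row carries a badly copied description.
TEST_LIST_CSV = """url,category_code,category_description
https://example.org/,NEWS,News Media
https://example.com/,HUMR,Human Rights
"""

def find_mismatches(list_text, codes):
    """Return (line_no, url, found, expected) for rows that disagree with the code list."""
    mismatches = []
    reader = csv.DictReader(io.StringIO(list_text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        expected = codes.get(row["category_code"])
        if row["category_description"] != expected:
            mismatches.append(
                (line_no, row["url"], row["category_description"], expected)
            )
    return mismatches

mismatches = find_mismatches(TEST_LIST_CSV, CATEGORY_CODES)
for line_no, url, found, expected in mismatches:
    print(f"line {line_no}: {url} has {found!r}, expected {expected!r}")
```

If the field were dropped, a check like this would not be needed at all, since the description would only live in one place.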

Moreover, when things like #314 pop up, we need to reprocess all the list files.

I suggest we drop it entirely. To avoid breaking consumers of the data, though, I am thinking we could instead redefine it as something like #380 or similar.

I suggest we leave this issue open for some time before doing it, so that consumers of the data can raise concerns about how it may break their usage of the test-lists.

That said, we should set a hard deadline for when we will proceed with this, if we decide it's a good idea.

@hellais added the review-needed label (Tickets that need a review from Citizenlab people) Feb 21, 2020
@jakubd
Member

jakubd commented Feb 26, 2020

I agree it is redundant, since it is the same as simply doing a vlookup or a join on the 00-LEGEND csvs. It can also introduce avenues for data inconsistencies, both for those who add to or update the list and for people relying on the long-form category_description when parsing. I also share the concern that removing a column can break existing parsing of the lists. However, I am not sure that redefinition would solve that particular problem. In my view, the same work would be required in any situation where we remove or change existing columns, as opposed to adding an additional column at the end, which I don't believe should break most list-parsing code.
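For consumers who want the long-form description, the join could look roughly like the sketch below. The legend layout, the column names (`url`, `category_code`), the sample rows, and the helpers `load_legend` and `join_descriptions` are all assumptions for illustration; the real 00-LEGEND csvs may use different headers.

```python
import csv
import io

# Hypothetical stand-in for a 00-LEGEND csv; headers and rows are illustrative.
LEGEND_CSV = """category_code,category_description
NEWS,News Media
HUMR,Human Rights Issues
"""

# Hypothetical test list that no longer carries category_description.
TEST_LIST_CSV = """url,category_code
https://example.org/,NEWS
https://example.com/,HUMR
"""

def load_legend(text):
    """Build a code -> description lookup from the legend csv."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["category_code"]: row["category_description"] for row in reader}

def join_descriptions(list_text, legend):
    """Attach the description to each row, like a vlookup on category_code."""
    rows = []
    for row in csv.DictReader(io.StringIO(list_text)):
        # Unknown codes get an empty description rather than raising.
        row["category_description"] = legend.get(row["category_code"], "")
        rows.append(row)
    return rows

legend = load_legend(LEGEND_CSV)
rows = join_descriptions(TEST_LIST_CSV, legend)
for row in rows:
    print(row["url"], row["category_code"], row["category_description"])
```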

All the use cases are different and I know there are many consumers of these lists we need to consider. I agree with this in spirit but think we need both caution and to give other list consumers a chance to chime in.

@bact
Contributor

bact commented Mar 19, 2020

Support the removal of the category_description field.

This will reduce the file size and also make it easier to quickly modify the list in a text editor (as each line will be shorter and more likely to fit on a screen).

In terms of the number of columns / column position and parsing, can the priority field that may get introduced (see #590 and ooni/ooni.org#431) replace the position of category_description?

In this fashion, all other columns will remain the same. Software that still consumes category_description may get the priority value instead, but it will not break the parsing.
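A rough sketch of what this would mean for the two common ways of parsing, by position and by header name. The column order, the `date_added` column, the sample values, and both helper functions are assumptions for illustration only.

```python
import csv
import io

# Hypothetical layouts before and after the swap; column order is an assumption.
OLD = (
    "url,category_code,category_description,date_added\n"
    "https://example.org/,NEWS,News Media,2020-02-21\n"
)
NEW = (
    "url,category_code,priority,date_added\n"
    "https://example.org/,NEWS,1,2020-02-21\n"
)

def third_column_by_position(text):
    """Positional consumer: reads whatever sits in the third column."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row[2] for row in reader]

def description_by_name(text):
    """Header-based consumer: looks the column up by name."""
    reader = csv.DictReader(io.StringIO(text))
    return [row.get("category_description") for row in reader]

# Positional parsing keeps working, but silently yields the priority value.
print(third_column_by_position(OLD))
print(third_column_by_position(NEW))

# Header-based parsing finds no such column in the new layout.
print(description_by_name(OLD))
print(description_by_name(NEW))
```

So positional consumers would keep running but get a different meaning in that slot, while header-based consumers would notice the missing column and need a small change either way.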

@bact
Contributor

bact commented Oct 11, 2022

Coming back to support this motion again.
