CSV Database filter script #19
Comments
purpose - to filter the input CSV data according to the filter type passed as an argument
about invalid data filter - the invalid data filter must always be run before any other filter, because it eliminates the data elements that have insufficient or invalid data
IMP
steps
Hey, so when you say "insufficient data", do you mean missing English or Marathi words exclusively, or does it also include missing examples and tags?
Only the main 2 words: 1 en and 1 mr.
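To make the rule above concrete, here is a minimal sketch of the invalid/insufficient data check: a row is kept only when both main words are present. The column names `en` and `mr` and the function names are assumptions for illustration; the real CSV headers are not given in this thread.

```python
# Hypothetical column names; adjust to the actual CSV headers.
EN_COL = "en"
MR_COL = "mr"


def is_valid(row):
    """Return True when both the English and the Marathi word are non-blank.

    Missing examples or tags do not make a row invalid.
    """
    en = (row.get(EN_COL) or "").strip()
    mr = (row.get(MR_COL) or "").strip()
    return bool(en) and bool(mr)


def filter_invalid(rows):
    """Keep only the rows that pass is_valid, preserving their order."""
    return [row for row in rows if is_valid(row)]
```

Rows are assumed to be dictionaries, such as those produced by `csv.DictReader`.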
Thanks. Also, could you explain what the "all words" filter is supposed to do?
"All words" basically means no filtering (other than the invalid/insufficient data, of course).
So I would just call the invalid/insufficient data scripts when the filter type is "All words"?
Yes. Pretty much.
Also, the filter by topic function will require the topic as an argument. Do you want me to add an optional topic argument to the main filter function?
Yes, you can do it in whichever way makes the functions easy to use and also reusable. What I've written in the gen-out.py file is just a basic example.
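One possible shape for the main filter function discussed above, as a sketch only: the invalid data check always runs first, "all words" returns its result unchanged, and the topic filter uses an optional `topic` argument. The filter type strings (`"all"`, `"topic"`), the column names, and the function names are all assumptions; this is not the code in gen-out.py.

```python
def _has_both_words(row):
    """Invalid data check: both main words (assumed columns 'en', 'mr') must be non-blank."""
    return bool((row.get("en") or "").strip()) and bool((row.get("mr") or "").strip())


def filter_rows(rows, filter_type="all", topic=None):
    """Filter a list of row dicts by filter type.

    filter_type "all"   -> only the invalid data filter is applied
    filter_type "topic" -> additionally keep rows whose 'topic' column equals `topic`
    """
    # The invalid data filter always runs before any other filter.
    valid = [row for row in rows if _has_both_words(row)]

    if filter_type == "all":
        return valid
    if filter_type == "topic":
        if topic is None:
            raise ValueError("filter_type 'topic' requires a topic argument")
        return [row for row in valid if (row.get("topic") or "").strip() == topic]
    raise ValueError(f"unknown filter type: {filter_type!r}")
```

Keeping `topic` optional means callers that only want "all words" never need to pass it, while a missing topic for the topic filter fails loudly instead of silently returning everything.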
Pending issues from PR #32; among these, the priority ones are # 1 and # 3.
Program to take a database (currently in CSV format) as input, keep only the necessary data (according to a filter criterion, which is another input), and output this data in the same format as the input database.
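A rough sketch of that input/filter/output flow, using Python's standard `csv` module. The function name `filter_csv` and the `keep` callback are hypothetical; the point is only that the output reuses the input's header so the format stays the same.

```python
import csv


def filter_csv(src, dst, keep):
    """Copy CSV rows from file object `src` to `dst`, keeping rows where keep(row) is true.

    The output has the same columns, in the same order, as the input.
    Returns the number of rows written (excluding the header).
    """
    reader = csv.DictReader(src)
    fieldnames = reader.fieldnames
    if not fieldnames:
        return 0  # empty input: nothing to write

    writer = csv.DictWriter(
        dst, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()

    kept = 0
    for row in reader:
        if keep(row):
            writer.writerow(row)
            kept += 1
    return kept
```

With real files, both should be opened with `newline=""` and an explicit encoding such as `utf-8`, since the data contains Marathi text.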