check python convertdate library against existing structured document dates #714

Closed
3 tasks done
rlskoeser opened this issue Mar 9, 2022 · 4 comments

Comments

@rlskoeser
Contributor

rlskoeser commented Mar 9, 2022

dev notes

  • export current data from the database
  • compare historic and converted document dates in the export to dates generated by convertdate for those calendars
  • generate a report of how well they match, and any mismatches so the team can review; also report on any calendars not supported by convertdate
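A minimal sketch of the comparison and report steps above, under stated assumptions: the row keys (`calendar`, `original`, `standard`) are hypothetical and do not reflect the actual export format, and each converter is assumed to take `(year, month, day)` and return a Gregorian `(year, month, day)` tuple, which is my understanding of how the `to_gregorian` functions in convertdate behave.

```python
from collections import Counter


def compare_dates(rows, converters):
    """Classify each exported row as match, mismatch, or unsupported.

    rows: iterable of dicts with hypothetical keys "calendar",
        "original" (a (year, month, day) tuple in that calendar) and
        "standard" (the stored converted date as an ISO string).
    converters: dict mapping a calendar name to a function that takes
        (year, month, day) and returns a Gregorian (year, month, day).
    """
    summary = Counter()
    mismatches = []
    for row in rows:
        convert = converters.get(row["calendar"])
        if convert is None:
            # No converter registered for this calendar: report it separately.
            summary["unsupported"] += 1
            continue
        y, m, d = convert(*row["original"])
        generated = f"{y:04d}-{m:02d}-{d:02d}"
        if generated == row["standard"]:
            summary["match"] += 1
        else:
            summary["mismatch"] += 1
            # Keep the full row plus the generated value for team review.
            mismatches.append({**row, "generated": generated})
    return summary, mismatches
```

In a notebook, `converters` might map each supported calendar name to the matching convertdate function, and `mismatches` could be written out with `csv.DictWriter` for review.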

should be done in jupyter or colab so it's easy to review the code and possibly adapt convertdate logic for use in the django application

For reference and more details on calendars, see PGP documentation on using the date fields

@rlskoeser rlskoeser added the 🛠️ chore One-off task or update label Mar 9, 2022
@rlskoeser
Contributor Author

@mrustow I wanted to test the python convertdate library against the structured dates that have been entered into the database so far, but when I looked at the documentation I'm not sure how our calendars map to what they support.

The library includes an Islamic calendar module, which supports conversion from the Hijri calendar; there is also a Persian calendar module, which is described as the Solar Hijri calendar, but it looks like this one is more modern (adopted in 1911).

There's a Hebrew calendar module, although how to use it is not very well described!

Would you mind looking at the list of available calendar modules at https://convertdate.readthedocs.io/en/latest/index.html and letting us know which ones to use for which of the calendars we currently support in the database? And for any that are not supported by this library, would you reach out to the folks who have the algorithms and see if they will share? If it makes sense, I'd be interested in contributing them to this python library.

@mrustow

mrustow commented Mar 11, 2022 via email

@rlskoeser rlskoeser self-assigned this May 5, 2022
@rlskoeser
Contributor Author

@mrustow @richmanrachel I've started working on date conversions for Hebrew and Islamic calendars using the convertdate library.

I have generated a preliminary report of the original and converted dates in the database compared with the ones I'm generating — it doesn't handle all cases and there are some errors in the logic, but I thought it would be worth sharing what I've got so far so that you can see how it's working and what kinds of things are causing me problems. (e.g., I'm pretty sure I'm doing something wrong when converting Hebrew years with no month and day because I'm not getting a range.)
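One possible way to get a range for a Hebrew year with no month or day, sketched under assumptions rather than taken from the notebook: the year is treated as running from 1 Tishri to 29 Elul, and the default month numbers assume convertdate's numbering (Nisan = 1, so Elul = 6 and Tishri = 7). The function name and parameters are hypothetical; the converter is passed in so that something like `convertdate.hebrew.to_gregorian` could be supplied.

```python
def hebrew_year_range(year, to_gregorian, tishri=7, elul=6, elul_days=29):
    """Return (start, end) Gregorian dates spanning a whole Hebrew year.

    to_gregorian: function taking (year, month, day) and returning a
        Gregorian (year, month, day) tuple.
    tishri, elul: month numbers for the first and last months of the
        year; the defaults assume Nisan is numbered 1.
    elul_days: Elul always has 29 days, so the year ends on 29 Elul.
    """
    # The year number changes at Tishri, so Elul of the same numbered
    # year falls almost twelve months after 1 Tishri, not before it.
    start = to_gregorian(year, tishri, 1)
    end = to_gregorian(year, elul, elul_days)
    return start, end
```

Because months are numbered from Nisan while the year begins in Tishri, converting month 1, day 1 would give a date in the middle of the year instead of its start, which is one way a year-only conversion can end up as a single date or a wrong range.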

I've added some local aliases to map the month names used in our data to the versions in the convertdate library, but I haven't handled all of them (especially the ones that weren't obvious).
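The alias approach could look roughly like the sketch below. The alternate spellings are illustrative guesses, not the actual month names in the database, and the month numbers assume the Nisan = 1 numbering that I understand convertdate's Hebrew module to use. The leap-year second Adar is left out here for brevity.

```python
# Canonical month names keyed to month numbers (Nisan = 1 is an assumption
# meant to mirror convertdate's Hebrew module).
HEBREW_MONTHS = {
    "Nisan": 1, "Iyyar": 2, "Sivan": 3, "Tammuz": 4,
    "Av": 5, "Elul": 6, "Tishri": 7, "Heshvan": 8,
    "Kislev": 9, "Teveth": 10, "Shevat": 11, "Adar": 12,
}

# Hypothetical local spellings mapped to the canonical names above.
ALIASES = {
    "Tishrei": "Tishri",
    "Tevet": "Teveth",
    "Iyar": "Iyyar",
}


def month_number(name):
    """Return the month number for a name, or None if it is not handled yet."""
    cleaned = name.strip()
    canonical = ALIASES.get(cleaned, cleaned)
    return HEBREW_MONTHS.get(canonical)
```

Returning `None` for unrecognized names keeps the non-obvious cases visible, so they can be listed in the report instead of being silently guessed.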

date-conversion-report.csv

@rlskoeser
Contributor Author

@mrustow you mentioned Coptic here and I see that it's also in the PGP documentation about dates, but it isn't defined as a supported calendar in the code. Should we add it?

2 participants