Fix whatsapp parsing #42

Closed
wants to merge 2 commits into from

@Denpeer Denpeer commented Jan 17, 2020

infer_datetime_regex within whatsapp.py handles single-digit month and day values in the WhatsApp timestamp incorrectly, causing the parser to ignore a huge number of messages. Whether the error occurs depends on the first time value encountered.

I'm not quite sure why the regex isn't a constant, but this pull request should fix the problem while maintaining the flexibility of the original function.
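To illustrate the failure mode described above, here is a minimal sketch. It is not the project's `infer_datetime_regex`; `build_pattern` is a hypothetical helper, and the sample lines assume an export format of the form `M/D/YY, HH:MM - Author: text`. It shows how a pattern derived from the first timestamp with exact digit widths can reject later lines, while allowing one or two digits per field accepts them.

```python
import re

def build_pattern(first_timestamp, tolerant):
    """Turn a sample timestamp into a regex (hypothetical helper, for illustration only)."""
    pieces = []
    # Split into digit runs and the separators between them.
    for part in re.split(r"(\d+)", first_timestamp):
        if part.isdigit():
            if tolerant and len(part) <= 2:
                # Accept both "1" and "12" for short numeric fields.
                pieces.append(r"\d{1,2}")
            else:
                # Exact width copied from the sample timestamp.
                pieces.append(r"\d{%d}" % len(part))
        else:
            pieces.append(re.escape(part))
    return re.compile("^" + "".join(pieces))

# Assumed example lines; real exports vary by device locale.
lines = [
    "1/17/20, 10:05 - Alice: first message",
    "12/25/19, 09:30 - Bob: second message",
]
sample = "1/17/20, 10:05"  # first timestamp encountered

strict = build_pattern(sample, tolerant=False)
loose = build_pattern(sample, tolerant=True)

strict_hits = [bool(strict.match(line)) for line in lines]
loose_hits = [bool(loose.match(line)) for line in lines]
```

With the strict pattern, the second line is skipped because its month has two digits while the sample had one; the tolerant pattern matches both lines.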

Before: (image)
After: (image)

@MasterScrat
Owner

Ping @mar-muel and @manueth who worked on this specific problem!

My understanding is that the date format depends on the locale of the user device, which is why things are a bit complicated.

@manuschn
Collaborator

Yes. The format (not just time but also special chars) of the datetime part before the username can differ from export to export.

Furthermore, multiline messages can have a date and/or time at the beginning of a line. Therefore, the pattern is determined before the actual parsing.
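A rough sketch of why a fixed header pattern helps with multiline messages follows. It assumes, purely for illustration, that the pattern has already been determined and looks like `M/D/YY, HH:MM - Author: text`; `HEADER` and `group_messages` are hypothetical names, not the parser's actual code. A line only starts a new message if it matches the whole header, so a continuation line that merely begins with a date is appended to the previous message.

```python
import re

# Assumed header format: "M/D/YY, HH:MM - Author: text" (varies by locale in practice).
HEADER = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}) - ([^:]+): (.*)$")

def group_messages(lines):
    """Group raw export lines into messages, folding continuation lines into the previous one."""
    messages = []
    for line in lines:
        m = HEADER.match(line)
        if m:
            # Full header found: this line starts a new message.
            messages.append(
                {"timestamp": m.group(1), "author": m.group(2), "text": m.group(3)}
            )
        elif messages:
            # No full header: treat as a continuation of the previous message,
            # even if the line happens to begin with a date.
            messages[-1]["text"] += "\n" + line
    return messages

export = [
    "1/17/20, 10:05 - Alice: Meeting notes",
    "1/18/20, 09:00 we meet again",
    "12/25/19, 09:30 - Bob: ok",
]
messages = group_messages(export)
```

Here the second line starts with a timestamp but lacks the ` - Author: ` part, so it stays attached to Alice's message instead of being treated as a new one.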

@Denpeer
Author

Denpeer commented Jan 22, 2020

Resolved by #46

@Denpeer Denpeer closed this Jan 22, 2020