Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for TelegramJsonParser #31

Merged
merged 4 commits into from
Nov 10, 2022
Merged

Fix for TelegramJsonParser #31

merged 4 commits into from
Nov 10, 2022

Conversation

galatolofederico
Copy link
Contributor

Messages in my Telegram JSON exports occasionally lack the "from" and "text" attributes. This PR fixes that.

Copy link
Owner

@joweich joweich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one small comment. Thank you for your contribution @galatolofederico!

Comment on lines 243 to 244
and type(mess["text"]) is str
and len(mess["text"]) > 0
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@galatolofederico Do we need these two additional checks? In which cases would they return false, while the first two checks return true?
If we don't need them, I would drop them for codestyle and performance reasons.

@galatolofederico
Copy link
Contributor Author

Hi honestly type(mess["text"]) is str is a bit of a workaround because it filters only the pure text messages.
text can be either a string if the message contains only text or a list containing strings and dicts.

For example

    {
      "id": ...,
      "type": "message",
      "date": "...",
      "date_unixtime": "...",
      "from": "...",
      "from_id": "...",
      "text": [
        "This is some text followed by ",
        {
          "type": "link",
          "text": "https://a-link"
        }
      ],
      "text_entities": [
        {
          "type": "plain",
          "text": "This is some text followed by "
        },
        {
          "type": "link",
          "text": "https://a-link"
        }
      ]
    },

The second check len(mess["text"]) > 0 is redundant i guess.
Maybe the best approach would be to use text as is, if it is a string otherwise rebuild the original message from the data structure.
Let me know what you think and i will update the PR

@joweich
Copy link
Owner

joweich commented Nov 10, 2022

Agree with your proposal of reconstructing the original message in case the text field is not a native string.
Really happy that you're contributing to this project!🤗

@galatolofederico
Copy link
Contributor Author

Thank you for sharing this amazing project!

@joweich
Copy link
Owner

joweich commented Nov 10, 2022

You might have a look at #18 if you want to leverage your expertise in telegram exports even more!

@joweich joweich merged commit 5a2b92a into joweich:main Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants