Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event must be a string #2920

Merged
merged 13 commits into from Jan 15, 2021
Merged

Event must be a string #2920

merged 13 commits into from Jan 15, 2021

Conversation

mariusandra
Copy link
Collaborator

Changes

  • Prevents storing random JSON blobs in the team.event_names array.
  • Those random objects break the event search in the frontend
  • Occurred with posthog-js pre-1.8.3 (fixed since), when calling posthog.capture({ object: 'bla' }).
  • Happened for a user
  • Both postgres and clickhouse error after the broken event is stored in event_names with something like:
# with clickhouse
raise TypeError(message)
TypeError: {'event': 'bla', 'properties': {'a': 'b'}} has type <class 'dict'>, but expected one of: (<class 'bytes'>, <class 'str'>)
  • Because of that protobuf error I also check for bytes and not only str, but we could only check str here. I'm not sure what might come over the wire and if checking just str would break anything or not. I'm willing to change this, but might not be needed.
  • Test added as well

Checklist

  • All querysets/queries filter by Organization, by Team, and by User
  • Django backend tests
  • Jest frontend tests
  • Cypress end-to-end tests

@timgl timgl temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 13:43 Inactive
@@ -67,6 +67,8 @@ def _alias(previous_distinct_id: str, distinct_id: str, team_id: int, retry_if_f
def store_names_and_properties(team: Team, event: str, properties: Dict) -> None:
# In _capture we only prefetch a couple of fields in Team to avoid fetching too much data
save = False
if not isinstance(event, str) and not isinstance(event, bytes):
raise TypeError("Event must be a string")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't be returned to the user submitting the event, it'll just show up in Sentry. If I think what I'd want for debugging purposes it'd be better if we just convert objects to strings (max 200 characters) so that I can at least see them come in and fix whatever happens, rather than have a silent failure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, updated!

@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 15:18 Inactive
@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 15:55 Inactive
@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 17:15 Inactive
@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 18:13 Inactive
@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 18:25 Inactive
@mariusandra
Copy link
Collaborator Author

@timgl this is good for another look. I'd merge directly due to your approval, but some things changed:

  • Since I'm locally using python 3.9, I had to upgrade mypy to get it to work, this required changing some unrelated types. Mainly request.user is now Union[AnonymousUser, User]. It required a tweak in decide.py and an # ignore in utils.py, which used a try/except block to catch the AnonymousUser.team error. There was some other issue with urls.py, which I seem to have fixed (tested locally, all urls still work like they should)
  • I added a migration to update team.event_names in case it contains non-strings. I coded it in a very defensive way, making it only run if there's something to update and also only importing the calculation function if something indeed needs to be changed. This way if we ever move the imports around (e.g. delete the calculate_event_property_usage_for_team function due to some future situation), the migrations won't all suddenly just fail. I could have dispatched a background task as well, but somehow starting background jobs in migrations seems weirdly non-deterministic and I'd opt to avoid that.

@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 12, 2021 18:36 Inactive
@Twixes Twixes self-requested a review January 15, 2021 11:54
@mariusandra mariusandra temporarily deployed to posthog-event-name-stri-fixl7d January 15, 2021 14:10 Inactive
@mariusandra mariusandra merged commit 99e74da into master Jan 15, 2021
@mariusandra mariusandra deleted the event-name-string branch January 15, 2021 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants