Move context querystring to end of JS-generated beacons #204

Closed
shermozle opened this issue May 9, 2014 · 4 comments
Labels
type:enhancement New features or improvements to existing features.

@shermozle

Context variables encourage us to shove just about anything into the beacons. Trouble is, that very rapidly ends up being quite long once the base64 encoding kicks in.

IE has a URL limit of 2083 characters and truncates anything beyond that. Context variables could very easily end up causing the URL to be truncated.
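
To get a feel for how quickly that limit is reached: standard padded base64 turns every 3 input bytes into 4 output characters. The sketch below is only rough arithmetic in TypeScript. The helper names and the example sizes are made up for illustration, and it ignores any extra growth from percent-encoding `+`, `/` or `=` in the querystring.

```typescript
// IE's maximum URL length, as quoted in the comment above.
const IE_MAX_URL_LENGTH = 2083;

// Length of standard padded base64 output for a given number of input bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Hypothetical helper: would a beacon exceed the IE limit once the
// base64-encoded context is appended as "&cx=<payload>"?
function exceedsIeLimit(baseUrlLength: number, contextByteCount: number): boolean {
  const cxPrefixLength = "&cx=".length; // 4 characters
  return baseUrlLength + cxPrefixLength + base64Length(contextByteCount) > IE_MAX_URL_LENGTH;
}

// 1500 bytes of context JSON alone become 2000 base64 characters.
const encodedChars = base64Length(1500);
```

So, under these assumptions, a context JSON of around 1.3 KB is already enough to push a beacon with a few hundred characters of other parameters past the limit.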

A workaround is to move the cx querystring parameter to the end of the URL so that, if the URL is truncated, hopefully only the context items are lost.
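
A minimal sketch of that reordering in TypeScript, assuming the beacon parameters are held in a plain object before being serialised. `buildQuerystring` is a hypothetical helper, not the tracker's actual code, and the parameter names other than `cx` are just examples.

```typescript
type Params = Record<string, string>;

// Serialise params so that the keys listed in lastKeys (by default "cx")
// come after everything else, whatever order they were added in.
function buildQuerystring(params: Params, lastKeys: string[] = ["cx"]): string {
  const keys = Object.keys(params);
  const head = keys.filter((k) => lastKeys.indexOf(k) === -1);
  const tail = lastKeys.filter((k) => Object.prototype.hasOwnProperty.call(params, k));
  return head
    .concat(tail)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(String(params[k])))
    .join("&");
}

// "cx" was added second but is emitted last.
const exampleQuerystring = buildQuerystring({ e: "pv", cx: "eyJhIjoxfQ", page: "Home" });
```

With this ordering, a URL cut off at a fixed length loses characters from the context payload before it loses any of the other fields.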

@alexanderdean
Member

A lot of sense in that. And the contexts will get more verbose once we add vendoring and versioning into them.

@shermozle, Fred has been doing a lot of work exploring batched sends and POST support in the Python tracker, see e.g. https://github.com/snowplow/snowplow-python-tracker/blob/feature/0.4.0/snowplow_tracker/consumer.py#L134

This is obviously going to be the dominant way of collecting events in @jonalmeida's new iOS tracker: https://github.com/snowplow/snowplow-ios-tracker/tree/develop

@shermozle do you have any thoughts on batched sends with POST for the JS Tracker? That will obviously fix any GET querystring size issues. The relevant ticket is here: #168

@shermozle
Author

You can't do POST across domains (same-origin policy) without user interaction, so it's a non-starter for JS collection.

How about a unique event ID and splitting the beacons at around 2000 chars, then reassembling in ETL? I can see all kinds of reasons this might be needed anyway (see what I just posted to the video tracking ticket).
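
A rough TypeScript sketch of what split-and-reassemble could look like, purely to make the idea concrete. The `Fragment` shape and its field names are hypothetical, not an existing Snowplow format, and `maxChars` only bounds the payload slice, so a real beacon would also need room for its other querystring fields.

```typescript
interface Fragment {
  eventId: string; // hypothetical: unique ID shared by all pieces of one event
  seq: number;     // 0-based position of this piece
  total: number;   // number of pieces the event was split into
  data: string;    // slice of the original payload
}

// Tracker side: cut one payload into pieces of at most maxChars characters.
function splitPayload(eventId: string, payload: string, maxChars: number): Fragment[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const total = Math.max(1, Math.ceil(payload.length / maxChars));
  const fragments: Fragment[] = [];
  for (let seq = 0; seq < total; seq++) {
    fragments.push({ eventId, seq, total, data: payload.slice(seq * maxChars, (seq + 1) * maxChars) });
  }
  return fragments;
}

// ETL side: rebuild the payload, or return null if a piece is missing
// or the fragments do not belong to the same event.
function reassemble(fragments: Fragment[]): string | null {
  const first = fragments[0];
  if (first === undefined) return null;
  const parts: (string | undefined)[] = [];
  for (let i = 0; i < first.total; i++) parts.push(undefined);
  for (const f of fragments) {
    if (f.eventId !== first.eventId || f.total !== first.total) return null;
    if (f.seq < 0 || f.seq >= first.total) return null;
    parts[f.seq] = f.data;
  }
  let joined = "";
  for (const p of parts) {
    if (p === undefined) return null;
    joined += p;
  }
  return joined;
}
```

The sketch tolerates fragments arriving in any order, but the `null` cases (a lost or late piece) are where most of the practical difficulty would lie.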

Simon Rumble simon@simonrumble.com

@shermozle
Author

That's #155


@alexanderdean
Member

> How about a unique event ID and splitting the beacons at around 2000 chars, then reassembling in ETL? I can see all kinds of reasons this might be needed anyway (see what I just posted to the video tracking ticket).

That's actually one of Snowplow's oldest tickets! snowplow/snowplow#20 I have closed it with the reasons why it is so difficult.

> You can't do POST across domains (same-origin policy) without user interaction, so it's a non-starter for JS collection.

We have had a fork of the Snowplow JS Tracker in production for a cross-domain ad tech client since last year. It uses async CORS POST for data collection, falling back to JSONP GET if CORS doesn't work. The CORS setup on the collector is a bit fiddly, but in practice it works fully for 82.03% of browsers: http://caniuse.com/cors

For the Snowplow JS Tracker we could attempt CORS POST, and if that fails, fall back to vanilla GET. It's easy to implement in the JS Tracker and we are doing all the Collector->Storage support for this anyway for native-mobile and server-side tracking.
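
The fallback logic itself could be sketched roughly as below in TypeScript. This is an illustration only, not the fork's code: the two transports are injected as callbacks (in a browser, the first would wrap a CORS POST and the second an ordinary GET beacon), and the names `Transport`, `Outcome` and `sendWithFallback` are hypothetical.

```typescript
type Outcome = "post" | "get" | "failed";

// A transport sends the payload and reports success or failure via done().
type Transport = (payload: string, done: (ok: boolean) => void) => void;

// Try the CORS POST transport first. If it reports failure or throws
// (e.g. the browser lacks CORS support), fall back to the GET transport.
function sendWithFallback(
  payload: string,
  corsPost: Transport,
  plainGet: Transport,
  onDone: (outcome: Outcome) => void
): void {
  let fellBack = false;
  const tryGet = (): void => {
    if (fellBack) return; // never run the fallback twice
    fellBack = true;
    try {
      plainGet(payload, (ok) => onDone(ok ? "get" : "failed"));
    } catch (e) {
      onDone("failed");
    }
  };
  try {
    corsPost(payload, (ok) => (ok ? onDone("post") : tryGet()));
  } catch (e) {
    tryGet();
  }
}
```

Keeping the transports as parameters means the same decision logic can be exercised without a browser or a collector, which is also how the behaviour can be checked in isolation.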

Separately, note that today Snowplow will reject a row whose JSON is truncated or otherwise invalid:

https://github.com/snowplow/snowplow/blob/master/3-enrich/scala-common-enrich/src/main/scala/com.snowplowanalytics.snowplow.enrich/common/utils/JsonUtils.scala#L66
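
The practical consequence for truncated beacons can be shown with a small TypeScript sketch. It only mirrors the behaviour described above using `JSON.parse`; it is not the Scala code at that link, and the example payload is invented.

```typescript
// Returns true only if raw parses as JSON.
function isParseableJson(raw: string): boolean {
  try {
    JSON.parse(raw);
    return true;
  } catch (e) {
    return false;
  }
}

// Invented example context, then the same text cut short as a
// truncated URL would leave it.
const fullContext = '{"schema":"x","data":{"a":1}}';
const truncatedContext = fullContext.slice(0, 20);

const fullIsValid = isParseableJson(fullContext);
const truncatedIsValid = isParseableJson(truncatedContext);
```

Here the full text parses and the cut-off text does not, so a truncated context cannot be partially salvaged by a plain JSON parser.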
