This repository has been archived by the owner on Mar 7, 2018. It is now read-only.

Add a view to power the facts page #165

Merged: c-w merged 2 commits into master from add-facts-view on Nov 28, 2017

Conversation

@c-w (Contributor) commented Oct 16, 2017

Test cql session after creating the view:

```cql
-- insert data

INSERT INTO eventplaces( eventid, conjunctiontopic1, conjunctiontopic2,
conjunctiontopic3, tileid, tilez, centroidlat, centroidlon, placeid,
insertiontime, eventtime, pipelinekey, externalsourceid ) VALUES( 'e1',
'foo', 'bar', 'baz', 'tile1', 1, 1.23, 2.34, 'place1', '2017-10-16',
'2017-10-16', 'facebook', 'someone' );

INSERT INTO eventplaces( eventid, conjunctiontopic1, conjunctiontopic2,
conjunctiontopic3, tileid, tilez, centroidlat, centroidlon, placeid,
insertiontime, eventtime, pipelinekey, externalsourceid ) VALUES( 'e2',
'foo', 'bar', 'baz', 'tile1', 1, 1.23, 2.34, 'place1', '2017-10-14',
'2017-10-14', 'facebook', 'someone' );

INSERT INTO eventplaces( eventid, conjunctiontopic1, conjunctiontopic2,
conjunctiontopic3, tileid, tilez, centroidlat, centroidlon, placeid,
insertiontime, eventtime, pipelinekey, externalsourceid ) VALUES( 'e3',
'foo', 'bar', 'baz', 'tile1', 1, 1.23, 2.34, 'place1', '2017-10-17',
'2017-10-17', 'facebook', 'someone' );

INSERT INTO eventplaces( eventid, conjunctiontopic1, conjunctiontopic2,
conjunctiontopic3, tileid, tilez, centroidlat, centroidlon, placeid,
insertiontime, eventtime, pipelinekey, externalsourceid ) VALUES( 'e4',
'foo', 'bar', 'baz', 'tile1', 1, 1.23, 2.34, 'place1', '2017-10-17',
'2017-10-17', 'twitter', 'someone' );

-- fetch twitter events, returns just e4
SELECT eventid, eventtime FROM eventsbypipeline WHERE
pipelinekey='twitter';

-- fetch facebook events, returns e3, e1, e2
SELECT eventid, eventtime FROM eventsbypipeline WHERE
pipelinekey='facebook';
```

@jcjimenez (Contributor) left a comment

LGTM

AND tileid IS NOT NULL
AND placeid IS NOT NULL
AND conjunctiontopic1 IS NOT NULL
PRIMARY KEY ((pipelinekey), eventtime, eventid, conjunctiontopic1, conjunctiontopic2, conjunctiontopic3, tilez, tileid, placeid)
Contributor

That MV would result in large data partitions and slow queries.

Please set the PK to the following: PRIMARY KEY ((pipelinekey, conjunctiontopic1, conjunctiontopic2, conjunctiontopic3, tilez), eventtime, eventid, tileid, placeid)

When you need to query for facts, you can query by a designated tilez = 15. For example:
SELECT * FROM eventplaces WHERE pipelinekey IN ('Twitter', 'Facebook') AND tilez = 15 AND conjunctiontopic1 IN ('{INCLUDE ALL TOPIC TERMS}') AND conjunctiontopic2 = '' AND conjunctiontopic3 = '' AND eventtime > '12/21/2017' AND eventtime < '12/31/2017';

You may even want to consider setting conjunctiontopic2 = '' and conjunctiontopic3 = '' within the MV WHERE clause.

@c-w (Contributor Author) commented Oct 18, 2017

We could do that, but in that case, do we really need this view? If we restrict the events shown by time, topics, etc., how would this component differ from the ActivityFeed component when the map is maximally zoomed out? Would it be simpler to just provide a "maximize" button that makes the ActivityFeed take over the entire screen, instead of adding a new data source, GraphQL endpoint, and frontend component?

Contributor Author

Thinking about this a bit more, maybe a nice way to partition this data would be by date (day or hour level). That way, we can implement the infinite scroll cleanly by requesting progressively earlier dates: every infinite-scroll request then maps to one partition.
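To make the date-partitioning idea concrete, here is a rough client-side sketch of how each scroll request could map to one day-level partition. The view name `eventsbyday`, its partition column `eventday`, and both helper functions are hypothetical and not part of this PR; the `%s` is just a bind-parameter placeholder.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple

# Hypothetical names: the view `eventsbyday` and its partition column
# `eventday` are illustrative assumptions, not something this PR defines.
PAGE_QUERY = "SELECT eventid, eventtime FROM eventsbyday WHERE eventday = %s"


def day_partitions(newest: date, pages: int) -> Iterator[str]:
    """Yield one ISO day key per scroll request, newest day first."""
    for offset in range(pages):
        yield (newest - timedelta(days=offset)).isoformat()


def page_request(newest: date, page: int) -> Tuple[str, Tuple[str]]:
    """Return the query and bound parameter for a 0-based scroll page."""
    day_key = (newest - timedelta(days=page)).isoformat()
    return PAGE_QUERY, (day_key,)
```

With this shape, page 0 reads the newest day's partition and each further scroll steps back one day. A day with no events would produce an empty page, so a real implementation would probably need to skip ahead until it finds data.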

Contributor Author

Pinging @erikschlegel: what's the word on the facts view? Do we really want to implement a separate page for the Facts view if it'll simply surface the same information as the NewsFeed when the map is maximally zoomed out?

I'm happy to implement this; I'm just worried that it will add extra maintenance burden for information we already surface in an existing view.

@c-w c-w merged commit aa05b05 into master Nov 28, 2017
@c-w c-w deleted the add-facts-view branch November 28, 2017 16:43
rachelnicole pushed a commit that referenced this pull request Jan 24, 2018
3 participants