This repository has been archived by the owner. It is now read-only.

Add more fields required for Cassandra schema #27

Merged
merged 3 commits into from Jun 23, 2017


Contributor

@c-w c-w commented Jun 22, 2017

No description provided.

@c-w c-w changed the title Add publisher info (required for Cassandra schema) Add more fields required for Cassandra schema Jun 22, 2017
Contributor

@jcjimenez jcjimenez left a comment

LGTM with minor question.

@@ -15,9 +17,11 @@ object BingPipeline extends Pipeline {

  private def convertToSchema(stream: DStream[BingPost], transformContext: TransformContext): DStream[AnalyzedItem] = {
    stream.map(post => AnalyzedItem(
      createdAtEpoch = now.getEpochSecond,
Contributor

@jcjimenez jcjimenez Jun 23, 2017

Question: Do these incoming posts have a date of their own? If so, do we prefer to use our own capture time as opposed to the timestamp in the post?

Contributor Author

@c-w c-w Jun 23, 2017

They have a crawl-date but not a post-date (BingPost.scala). Not sure if that's something that would be more useful to us than a timestamp for which we definitely know what it means.
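For illustration only, the trade-off discussed above could be sketched as a helper that prefers a post's crawl date and falls back to the pipeline's own capture time when the crawl date is missing or unparseable. This is not what the PR does (the diff uses `now.getEpochSecond`). `CrawledPost`, its `crawlDate` field, the ISO-8601 format, and `crawlEpochOrElse` are all hypothetical names and assumptions, since the actual shape of `BingPost` is not shown in this thread.

```scala
import java.time.Instant
import scala.util.Try

// Hypothetical stand-in for BingPost: the real field name and date format
// are not shown in this thread, so an optional ISO-8601 string is assumed.
final case class CrawledPost(url: String, crawlDate: Option[String])

object Timestamps {
  // Returns the crawl date as epoch seconds when it is present and parses
  // as an ISO-8601 instant; otherwise returns the capture time instead.
  def crawlEpochOrElse(post: CrawledPost, captureTime: Instant): Long =
    post.crawlDate
      .flatMap(raw => Try(Instant.parse(raw)).toOption)
      .getOrElse(captureTime)
      .getEpochSecond
}
```

Using the capture time unconditionally, as the PR does, keeps the meaning of `createdAtEpoch` unambiguous. A fallback like this one mixes two kinds of timestamp in a single field, so it would only be worthwhile if the crawl date turned out to be more useful downstream.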

@c-w c-w merged commit 0716350 into master Jun 23, 2017
2 checks passed
@c-w c-w deleted the add-publisher branch Jun 23, 2017
@c-w c-w removed the in progress label Jun 23, 2017