This repository was archived by the owner on Mar 7, 2018. It is now read-only.

Conversation

@c-w commented Jun 22, 2017

No description provided.

@c-w changed the title from "Add publisher info (required for Cassandra schema)" to "Add more fields required for Cassandra schema" Jun 22, 2017

@jcjimenez left a comment

LGTM with minor question.


private def convertToSchema(stream: DStream[BingPost], transformContext: TransformContext): DStream[AnalyzedItem] = {
  stream.map(post => AnalyzedItem(
    createdAtEpoch = now.getEpochSecond,

Question: Do these incoming posts have a date of their own? If so, do we prefer to use our own capture time as opposed to the timestamp in the post?

@c-w commented Jun 23, 2017

They have a crawl-date but not a post-date (see BingPost.scala). I'm not sure the crawl-date would be more useful to us than a timestamp whose meaning we know for certain.
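
For illustration only, the two options discussed here could be sketched roughly as below. This is not the code from the diff: `Post` and `Item` are hypothetical, simplified stand-ins for `BingPost` and `AnalyzedItem`, and the crawl date is assumed to be available as an `Option[Instant]`, which may not match the real field in BingPost.scala.

```scala
import java.time.Instant

// Hypothetical stand-ins; the real BingPost and AnalyzedItem have more fields.
final case class Post(url: String, crawlDate: Option[Instant])
final case class Item(createdAtEpoch: Long, url: String)

object TimestampChoice {
  // Option A (what the diff does): stamp every item with our own capture time.
  def withCaptureTime(post: Post, now: Instant): Item =
    Item(createdAtEpoch = now.getEpochSecond, url = post.url)

  // Option B (the alternative raised in review): prefer the crawl date when
  // present and fall back to the capture time otherwise.
  def withCrawlDate(post: Post, now: Instant): Item =
    Item(createdAtEpoch = post.crawlDate.getOrElse(now).getEpochSecond, url = post.url)
}
```

Option A keeps `createdAtEpoch` unambiguous (it always means "when we ingested the item"), while Option B would mix two meanings in one field whenever the crawl date is missing.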

@c-w merged commit 0716350 into master Jun 23, 2017
@c-w deleted the add-publisher branch June 23, 2017 17:06
@c-w removed the in progress label Jun 23, 2017