Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Institutions cassandra #1013

Merged
merged 31 commits into from
Jun 16, 2017
Merged

Institutions cassandra #1013

merged 31 commits into from
Jun 16, 2017

Conversation

jmarin
Copy link
Contributor

@jmarin jmarin commented Jun 13, 2017

Closes #1012
Doesn't remove PostgreSQL implementation yet

@jmarin jmarin added this to the Sprint 33 milestone Jun 13, 2017
@jmarin jmarin changed the title WIP: Institutions cassandra Institutions cassandra Jun 13, 2017
@codecov-io
Copy link

codecov-io commented Jun 13, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@29021da). Click here to learn what that means.
The diff coverage is 80.95%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1013   +/-   ##
=========================================
  Coverage          ?   93.13%           
=========================================
  Files             ?      291           
  Lines             ?     3830           
  Branches          ?       73           
=========================================
  Hits              ?     3567           
  Misses            ?      263           
  Partials          ?        0
Impacted Files Coverage Δ
api/src/main/scala/hmda/api/HmdaPlatform.scala 0% <0%> (ø)
.../institutions/InstitutionCassandraProjection.scala 0% <0%> (ø)
...ry/src/main/scala/hmda/query/CassandraConfig.scala 100% <100%> (ø)
.../institutions/InstitutionCassandraRepository.scala 100% <100%> (ø)
...la/hmda/query/repository/CassandraRepository.scala 56.25% <56.25%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29021da...1172811. Read the comment docs.

@schbetsy
Copy link
Contributor

Smoke tests for this PR look good, both locally and on Docker-Compose. I still need to take a closer look at the code, but I wanted to say 👍 so far.

I did notice a long keycloak stack trace in the docker-compose logs that started with the following line. I'm not sure it's relevant to this PR, but I wanted to call it out.

keycloak_1     | 14:23:36,222 ERROR [org.keycloak.services] (default task-14) KC-SERVICES0067: 
    failed to parse RestartLoginCookie: java.lang.RuntimeException: 
    java.lang.RuntimeException: java.security.InvalidKeyException: key is null

@jmarin
Copy link
Contributor Author

jmarin commented Jun 14, 2017

Thanks @schbetsy. I would be surprised if that error is related to this PR, maybe it requires a bit more investigation.


retry-unsuccessful-join-after = 20s

//DON'T USE IN PRODUCTION!!!!!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will this value be in production? Why do we need a temporary value here now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto downing of nodes is not recommended in production systems due to the possibility of network failures that form a partition in the cluster. It's useful in development so that the cluster lifecycle (joining, unreachable, leaving, etc. ) can be tested

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Let's make sure there's an open issue in GH to remind us to update that value before we go to production.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created. #1017

cassandra {
host = "127.0.0.1"
host = ${CASSANDRA_CLUSTER_HOSTS}
port = 9142
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes the Cassandra port is set to 9142 and sometimes 9042. I haven't figured out the pattern for which should be used where. What's the logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mistake from a previous try. CassandraUnit uses 9142, but we're not using it when running the application. Will change back to default port


def preparedStatement(implicit session: Session): PreparedStatement = {
session.prepare(s"INSERT INTO $keyspace.institutions" +
s"(id," +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 23-51 don't have substitutions on them, so they probably don't need the s before the ".


class InstitutionCassandraProjection extends InstitutionCassandraRepository {

def startUp(): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm understanding this method correctly. It looks like, when the system starts up (in HmdaPlatform.scala), it uses this method to replay all of the Institution-related events from the read journal into Cassandra.

Is that the case? Can you give any more context to help clarify?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This starts up a stream of events from the journal, yes. The difference from previous implementations (i.e. what happens in validation) is that this stream is running continuously, listening to changes in the journal to push those events to the read model. One part that we haven't done yet (because it's not really 100% clear yet what the requirements are) is storing snapshots as an optimization, so not every event has to be read back to the read model.

session.execute(query)
}

override def insertData(source: Source[InstitutionQuery, NotUsed]): Future[Done] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the insertData and readData methods are currently used in a spec (InstitutionCassandraRepositorySpec), but they aren't called explicitly from anywhere else in the code. Is this another case where they're called implicitly somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not right now. Both of those are convenience methods. Reading will happen when we introduce publication (down the road) and the write method is useful if we need to populate a panel file directly (i.e. for historical data)


cassandra {
host = "127.0.0.1"
port = 9142
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this port also change from 9142 to 9042?


cassandra {
host = "127.0.0.1"
port = 9142
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above about port 9142

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because EmbeddedCassandraServerHelper uses port 9142 for tests

@schbetsy
Copy link
Contributor

I've seen this in the Travis output during the specs, but it doesn't come up when I run the specs locally.

[error] Error during tests:
[error]   Running java with options -classpath /home/travis/build/cfpb/hmda-platform/query/target/scala-2.12/test-classes:/home/travis/build/cfpb/hmda-platform/query/target/scala-2.12/classes:
           [...  I truncated this really long classpath]
           2.10.6/org.scala-sbt/sbt/0.13.13/test-interface-1.0.jar sbt.ForkMain 38993 failed with exit code 137

This is new as of this PR, right? Any idea where it might be coming from?

@schbetsy
Copy link
Contributor

We'll merge this around midday tomorrow, even if Travis doesn't pass. Tests pass locally and smoke tests look good, so we're calling this safe to merge.

build.sbt Outdated
@@ -162,7 +162,8 @@ lazy val query = (project in file("query"))
oldStrategy(x)
},
parallelExecution in Test := false,
libraryDependencies ++= configDeps ++ akkaPersistenceDeps ++ slickDeps
fork in Test := true,
Copy link
Contributor

@schbetsy schbetsy Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this line is causing the java.io.EOFException that causes the "sbt.ForkMain failed with exit code 137" error in Travis.

Is forking necessary in the query project's specs? If so, why?

EDIT: Well, I can't actually confirm that this line is causing the error. But my question still stands.
EDIT2: Anecdotally, I removed fork in Test := true on a test branch, and tests all passed in Travis.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source: I learned more about this error by reading this GH issue from SBT.

@schbetsy schbetsy merged commit 0ef4f2a into cfpb:master Jun 16, 2017
@jmarin jmarin deleted the institutions-cassandra branch June 21, 2017 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants