add code comments
davidangb committed Jun 3, 2019
1 parent 1b92b1c commit df53ff0
Showing 2 changed files with 16 additions and 1 deletion.
@@ -287,6 +287,15 @@ class ElasticSearchTrialDAO(client: TransportClient, indexName: String, refreshM
}

private def startScroll: (String, List[TrialProject]) = {
/* Start an Elasticsearch scroll request. This allows us to retrieve all records in the index,
in batches of N records.
We currently have N set to 250.
Dear future engineer: if you find that you ever need to change this value, even
just to experiment with other values, or to use different values in different environments,
please move it to config so that it is easier to tweak.
*/
val startScrollRequest = client
.prepareSearch(indexName)
.setQuery(Ready)
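
The comment above describes retrieving every record in batches through a scroll. The rest of the DAO is not shown in this hunk, so the following is only a minimal sketch of the general pattern, assuming a scroll ends when a page comes back with no hits. `Page`, `drain`, and `next` are hypothetical names for illustration and are not part of the commit or of the Elasticsearch client API.

```scala
object ScrollSketch {
  // Hypothetical stand-in for one scroll response: the scroll id to continue
  // with, plus the hits returned in this batch.
  final case class Page[A](scrollId: String, hits: List[A])

  // Accumulate hits page by page until a page with no hits is returned.
  // `next` is an assumed function that fetches the following page for a scroll id.
  @annotation.tailrec
  def drain[A](page: Page[A], next: String => Page[A], acc: Vector[A] = Vector.empty): Vector[A] =
    if (page.hits.isEmpty) acc
    else drain(next(page.scrollId), next, acc ++ page.hits)
}
```

In this shape, the batch size (250 in the comment) only affects how `next` is implemented, which is one reason it could be passed in from config rather than hard-coded.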
@@ -66,7 +66,13 @@ trait TrialServiceSupport extends LazyLogging {
val assignedUsers:List[String] = projects.toList.filter(p => p.user.isDefined).flatMap(_.user.map(_.userSubjectId))
logger.info(s"makeSpreadsheetValues found ${assignedUsers.length} users assigned to projects ...")

// split users into chunks of size N for efficient querying to Thurloe
/* split users into chunks of size N for efficient querying to Thurloe
based on empirical perf/scale testing, the most efficient value for N is ~40.
Dear future engineer: if you find that you ever need to change this value, even
just to experiment with other values, or to use different values in different environments,
please move it to config so that it is easier to tweak.
*/
val profileQueries = assignedUsers
.grouped(40) // tweak chunk size here!
.map{ userChunk => thurloeDAO.bulkUserQuery(userChunk, thurloeKeys) }
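
The comment in this hunk asks that the chunk size be moved to config if it ever needs to change. One possible shape for that is sketched below, assuming the value arrives as a plain string setting. The key name `trial.thurloeChunkSize`, the `ChunkingSketch` object, and both helpers are hypothetical and do not appear in the commit; only the default of 40 comes from the comment.

```scala
import scala.util.Try

object ChunkingSketch {
  // Default taken from the commit's comment (~40 from empirical testing).
  val DefaultChunkSize: Int = 40

  // Hypothetical lookup: read the chunk size from a settings map and fall back
  // to the default when the key is missing, non-numeric, or not positive.
  def chunkSizeFrom(settings: Map[String, String]): Int =
    settings.get("trial.thurloeChunkSize")
      .flatMap(s => Try(s.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(DefaultChunkSize)

  // Split users into chunks of `chunkSize` and run `query` once per chunk.
  def queryInChunks[A](users: List[String], chunkSize: Int = DefaultChunkSize)(query: List[String] => A): List[A] = {
    require(chunkSize > 0, "chunkSize must be positive")
    users.grouped(chunkSize).map(query).toList
  }
}
```

With something like this in place, the call site would pass `chunkSizeFrom(...)` instead of the literal `40`, so different environments could use different values without a code change.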