Server and Search Architecture

Scott Halstvedt edited this page Jun 24, 2015 · 2 revisions

AskNature 2.0 - Server and Search Architecture

Architecture

AN 2.0 runs on Node.js and relies heavily on model generators to produce API results. A model generator is a query builder that is initialized with the class of an object, its attributes, and its relationships; it can then fetch and mutate records in the database while enforcing the provided schema. The schema can be supplied at configure-time as a static schema in server/models or at request-time using ANQL. Usage of model generators is detailed in the section Abstractions and Model below.

Components and libraries

The components and libraries used on the server are listed in the page "The Stack" on this wiki.

Abstractions and Model

A generator has the following signature: ConstructModel = function(entityName, fields, relationships), where entityName is the class of the object in the database, fields is an array of fields to include in the schema, and relationships is an array of generated Model objects for related entities.
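To show how the three arguments fit together, here is a minimal in-memory sketch. It is not the server's implementation: the real generator queries the database, while this stand-in only records the schema. The project helper and the field names ("name", "summary", "description") are hypothetical and used for illustration only.

```javascript
// Hypothetical stand-in for a model generator. The real ConstructModel
// builds database queries; this version only stores the schema it is given.
function ConstructModel(entityName, fields, relationships) {
  return {
    entityName: entityName,
    fields: fields,
    relationships: relationships,
    // Hypothetical helper: keep only the attributes named in the schema.
    project: function (record) {
      const out = {};
      for (const field of fields) {
        if (field in record) out[field] = record[field];
      }
      return out;
    },
  };
}

// Related models are generated first and passed in as relationships.
const FunctionModel = ConstructModel("Function", ["name", "description"], []);
const StrategyModel = ConstructModel("Strategy", ["name", "summary"], [FunctionModel]);
```

Passing generated models as relationships means a Strategy schema carries the Function schema with it, so both can be enforced together when related records are fetched.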

Secrets storage

All API keys and passwords are stored on the server in server/config/secrets.json. Values in angle brackets are placeholders to be replaced with quoted strings. The structure of the file is as follows:

{
  "server" : {
    "host": <database-host>,
    "port": 2424,
    "httpPort": 2480,
    "username": "root",
    "password": <orientdb-root-password>,
    "enableRIDBags": false
  },
  "database" : {
    "name": <orientdb-database>,
    "username": <orientdb-username>,
    "password": <orientdb-password>
  },
  "passport_google" : {
    "clientID": <google-id>,
    "clientSecret": <google-secret>,
    "callbackURL": "http://asknatu.re:8080/auth/google/callback"
  },
  "passport_facebook" : {
    "clientID": <facebook-id>,
    "clientSecret": <facebook-secret>,
    "callbackURL": "http://asknatu.re:8080/auth/facebook/callback"
  },
  "passport_linkedin" : {
    "clientID": <linkedin-id>,
    "clientSecret": <linkedin-secret>,
    "callbackURL": "http://asknatu.re:8080/auth/linkedin/callback"
  },
  "sendgrid" : {
    "auth": {
      "api_user": <sendgrid-user>,
      "api_key": <sendgrid-key>
    }
  },
  "s3" : {
    "accessKeyId": <s3-id>,
    "secretAccessKey": <s3-secret>,
    "bucket": <s3-bucket>
  }
}
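A short sketch of how the file could be read and sanity-checked at startup follows. The function names loadSecrets and missingSections are hypothetical, and treating every top-level section as required is an assumption made for this example, not something the server is known to enforce.

```javascript
const fs = require("fs");

// Top-level sections shown in the structure above. Treating all of them
// as required is an assumption made for this sketch.
const EXPECTED_SECTIONS = [
  "server",
  "database",
  "passport_google",
  "passport_facebook",
  "passport_linkedin",
  "sendgrid",
  "s3",
];

// Return the names of expected sections that are absent or not objects.
function missingSections(secrets) {
  return EXPECTED_SECTIONS.filter(function (name) {
    return typeof secrets[name] !== "object" || secrets[name] === null;
  });
}

// Read and parse the secrets file, failing fast if a section is missing.
function loadSecrets(path) {
  const secrets = JSON.parse(fs.readFileSync(path, "utf8"));
  const missing = missingSections(secrets);
  if (missing.length > 0) {
    throw new Error("secrets file is missing sections: " + missing.join(", "));
  }
  return secrets;
}
```

A caller would use it as loadSecrets("server/config/secrets.json") and read, for example, the returned object's database.name property.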

Graph search theory and flow

AskNature 2.0 contains a novel graph search algorithm with two components: search and clustering. Search gathers results based on relationships with matching objects of other classes; clustering groups results based on the most prominent recorded relationships. For example, a search for "move" on Strategy objects over their Function relationships, grouped by Function, will search all Strategies for that keyword, search Function objects for the same keyword, and propagate each Function's score to all of its related Strategies. The resulting Strategies are then grouped under their related Functions, with the ranking of the Functions determined by the composite Lucene score of the Function and the Strategies it contains. Strategies that are tied to multiple Functions will appear multiple times in the result tree.

Search

The search phase extends the built-in Lucene index search by running the query on multiple related classes of objects and then propagating scores through relationships with an inverse-square falloff over the number of hops. For example, searching for a Strategy performs a traditional Lucene query on Strategies, but also searches Functions and propagates a score down to the related Strategies proportional to the degree of match.
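The propagation step can be sketched as follows. The exact falloff formula and the way scores are combined are not specified here, so this example assumes a weight of 1 / (hops + 1)^2 and simple addition; the function names and the edge representation are hypothetical.

```javascript
// Assumed falloff: a match `hops` relationships away is weighted by
// 1 / (hops + 1)^2, so a direct match (0 hops) keeps its full score.
function falloff(hops) {
  return 1 / Math.pow(hops + 1, 2);
}

// directScores:  Map of strategy id -> Lucene score from querying Strategies.
// relatedScores: Map of function id -> Lucene score from querying Functions.
// edges:         array of { strategy, func } pairs, one per relationship.
// Returns a Map of strategy id -> combined score (summing is an assumption).
function propagateScores(directScores, relatedScores, edges) {
  const totals = new Map(directScores);
  for (const edge of edges) {
    const related = relatedScores.get(edge.func);
    if (related === undefined) continue; // this Function did not match
    const current = totals.get(edge.strategy) || 0;
    // Each Function is one hop away from its Strategies.
    totals.set(edge.strategy, current + related * falloff(1));
  }
  return totals;
}
```

Under these assumptions, a Strategy that does not match the keyword itself can still appear in the results when one of its Functions matches, at a quarter of that Function's score.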

Clustering

The clustering phase operates on the final, scored list of objects of the class being searched. Their scores are propagated back to the related categories (the clusters) and combined with each cluster's own Lucene score, if one is defined.
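A minimal sketch of this grouping, using Strategies clustered by Function, is shown below. How the composite score is formed is not specified here; this example assumes it is the cluster's own Lucene score (or 0 when undefined) plus the sum of its members' scores. The function name and data shapes are hypothetical.

```javascript
// strategyScores: Map of strategy id -> final score from the search phase.
// functionScores: Map of function id -> the Function's own Lucene score.
// edges:          array of { strategy, func } pairs, one per relationship.
// Returns clusters sorted by descending composite score.
function clusterByFunction(strategyScores, functionScores, edges) {
  const clusters = new Map();
  for (const edge of edges) {
    const score = strategyScores.get(edge.strategy);
    if (score === undefined) continue; // Strategy is not in the result list
    if (!clusters.has(edge.func)) {
      clusters.set(edge.func, {
        func: edge.func,
        // Start from the cluster's own score, if defined (assumed composite).
        score: functionScores.get(edge.func) || 0,
        strategies: [],
      });
    }
    const cluster = clusters.get(edge.func);
    cluster.score += score;
    cluster.strategies.push({ strategy: edge.strategy, score: score });
  }
  const result = Array.from(clusters.values());
  for (const cluster of result) {
    cluster.strategies.sort(function (a, b) { return b.score - a.score; });
  }
  return result.sort(function (a, b) { return b.score - a.score; });
}
```

Because the loop visits every relationship, a Strategy tied to several Functions is added to each of their clusters, which matches the repeated entries in the result tree.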