Agora

Prerequisites

What you need to run this app:

  • node and npm (brew install node)

    • Ensure you're running Node v16.x.x or later and npm 8.x.x or later
  • A version 6.0+ MongoDB instance running on your local machine

  • You can optionally use a GUI like Compass or Studio3T with your local database

    • Note that only Studio3T is compatible with the AWS DocumentDB instances in Agora's dev, stage and prod environments. Either GUI tool will work with your local Mongo instance.

If you have nvm installed, which is highly recommended (brew install nvm), you can run nvm install --lts && nvm use in your terminal to use the latest Node LTS. You can also have zsh do this for you automatically.
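
To confirm that the installed versions meet the minimums listed above, a quick check along these lines may help. This is a minimal sketch, not part of this repository: major_version and version_at_least are hypothetical helper names introduced for illustration.

```shell
# Print the major component of a version string such as "v16.20.1" or "8.19.4".
major_version() {
  v="${1#v}"                 # drop a leading "v" if present
  printf '%s\n' "${v%%.*}"   # keep everything before the first dot
}

# Succeed when the major version of $1 is at least $2.
version_at_least() {
  [ "$(major_version "$1")" -ge "$2" ]
}

# Compare the installed tools against the minimums from the prerequisites.
if command -v node >/dev/null 2>&1 && command -v npm >/dev/null 2>&1; then
  version_at_least "$(node --version)" 16 || echo "Node 16 or later is required" >&2
  version_at_least "$(npm --version)" 8 || echo "npm 8 or later is required" >&2
else
  echo "node and/or npm not found on PATH" >&2
fi
```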

Getting Started

1 - Install

# Clone the repo
git clone https://github.com/Sage-Bionetworks/Agora.git

# Go to repo folder
cd Agora

# Install dependencies
npm install

The next sections focus on setting up a mongo database loaded with Agora's data. There are two options:

  • Use a local mongo database that you manually create and populate with data (steps 2-5)
  • Use a containerized mongo database that is pre-populated with data (step 6)

2 - Create local database

You will need to create a MongoDB database and name it agora.

Note: You can use the following scripts to start the database:

# Linux and MacOS
npm run mongo:start

# Windows
npm run mongo:start:windows

3 - Populate local database

Agora's data is stored in json files in the Agora Synapse project, in the following subfolders:

  • Agora Live Data - This folder contains all production data releases, as well as data releases that were never released to production
  • Agora Testing Data - This folder contains test data releases that may not be fully validated
  • Exploratory Data - This folder contains exploratory data files, and subfolders for data releases generated locally via the agora-data-tools ETL tool
  • Mock Data - This folder is reserved for future testing efforts

Note: some data may require joining the Agora Editors team to gain access. Request to join the team here.

The image files surfaced on Agora's Teams page are stored in Synapse here; there is only one set of image files, and the most recent version is always used. The image files aren't considered part of a data release.

The contents of a given data release are defined by a specific version of a data_manifest.json file. The manifest file lists the synID and version of each data file in the release. The manifest files generated by the ETL framework are uploaded to the same Synapse folder as the data files they reference.

To populate your local database, you need to download the appropriate file(s), import them into Mongo, and then index the collections. Each of these steps can be achieved manually, or by using the scripts defined in this project.

If you want to load a set of data files that span multiple Synapse folders, you can do one of the following:

  • Load some or all of the data manually
  • Create a custom manifest file that defines the contents of the custom data release; custom manifest files should be uploaded only to the Exploratory Data folder

Manual data population

It may make sense to populate your database manually when there is no manifest available for the specific set of files you want to load, and/or if you want to load only a small number of new, modified, or exploratory data files.

Each of the required steps can be achieved manually:

You can also combine any of the manual steps with the scripts that perform the other steps.

Using the data population scripts

You can use the data population scripts defined in this repository to download, ingest, and index data. These steps can be performed individually by invoking the commands described in the following sections, or you can use a single command to perform all three steps.

Prerequisites

To populate data into your local database using the scripts defined in this project, you must:

  1. Install the Mongo Database Tools
  2. Install the package manager pip, if necessary (Python 3.4+ ships with pip).
  3. Use pip to install the synapseclient using the following command:
python3 -m pip install synapseclient
  4. Create a Synapse PAT as described here
  5. Add your PAT to .synapseConfig as described here
  6. If necessary, install wget, which is a dependency of the data population scripts.
  7. If necessary, add python3 to your path, as described here
SHELL_SETTINGS_FILE=~/".bash_profile"
echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> "${SHELL_SETTINGS_FILE}"
source "${SHELL_SETTINGS_FILE}"
  8. Confirm that the synapse package and credentials are properly configured:
synapse login
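
Before running the data population scripts, it can be useful to check that the command-line tools from the list above are on your PATH. The following is a rough sketch under that assumption; missing_tools is a hypothetical helper, not something this repository provides.

```shell
# Print the name of each given command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# mongoimport ships with the Mongo Database Tools; synapse is the CLI
# installed by the synapseclient package.
missing="$(missing_tools mongoimport python3 wget synapse)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
fi
```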
Provisioning a local database with a single command

Use this command to sequentially download data and image files, ingest those files, and index the collections in your local db; you will be prompted to provide information about the manifest file that you want to use:

npm run data:local:mongo
Downloading data and image files

Use this command to download data and image files from synapse to the ./local/data folder in this project; you will be prompted to provide information about the manifest file that you want to use:

npm run data:local
Importing data and image files

Use this command to import the data and image files in your ./local/data folder:

# Imports all data files and team images
npm run mongo:import
Indexing Mongo collections

Use this command to add indexes:

# Creates indexes
npm run mongo:create:indexes

You'll need Linux to run the previous scripts. If you need to do this in Windows, you can get any Linux distribution at the Windows Store (e.g. Ubuntu).
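
When only some of the steps need repeating (for example, re-importing and re-indexing after downloading files manually), the individual commands above can be chained. A minimal sketch, assuming a POSIX shell; run_steps and DRY_RUN are hypothetical names introduced here for illustration and are not defined by this project.

```shell
# Run the given npm scripts in order, stopping at the first failure.
# Set DRY_RUN to any non-empty value to print the commands instead of running them.
run_steps() {
  for step in "$@"; do
    if [ -n "${DRY_RUN:-}" ]; then
      echo "npm run $step"
    else
      npm run "$step" || return 1
    fi
  done
}

# Example: skip the download and only import and index.
# run_steps mongo:import mongo:create:indexes
```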

4 - Build using local database

# Build the server and app
npm run dev

5 - Start using local database

# Start the server and app
npm run start

Go to http://localhost:8080

6 - Use containerized database

  1. Install Docker, if necessary.
  2. Update data-file and data-version in package.json to reflect the desired data release version, if necessary.
  3. Create an environment file: npm run create-env.
  4. Start the containerized database: npm run docker:db:start. The necessary images will be pulled from GHCR. If you would like to use a different image, update DATA_IMAGE_PATH in .env. If the desired image does not exist, see steps below to create the desired image.
  5. Run the server and app against the containerized database: npm run docker:dev.
  6. Stop the containerized database: npm run docker:db:stop.
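
Step 4 mentions updating DATA_IMAGE_PATH in .env. One way to do that from the command line is sketched below, assuming .env holds one KEY=VALUE pair per line (the usual dotenv layout, which this README does not confirm). set_env_var is a hypothetical helper, not part of this repository.

```shell
# Set KEY=VALUE in an env file, replacing any existing assignment of KEY.
# The new assignment is appended, so the key may move to the end of the file.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp)" || return 1
  if [ -f "$file" ]; then
    # Copy every line except an existing assignment of this key.
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"   # overwrite in place to keep the file's permissions
  rm -f "$tmp"
}

# Example, using the image reference quoted later in this README:
# set_env_var .env DATA_IMAGE_PATH ghcr.io/hallieswan/agora-data-nonmonorepo:syn13363290.68
```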

Creating an image for a new data release

A "data release" is defined in the package.json by the data-file and data-version values. Images pre-loaded with data from the data release are created when the ci.yml GitHub Action workflow runs, and are pushed to the GitHub Container Registry (GHCR) package for the namespace in which the workflow runs: the sage-bionetworks organization namespace when it runs in the base repo, or the user's namespace (e.g. hallieswan) when it runs in a forked repo.

The sage-bionetworks package will contain images for data releases that have been specified in package.json on develop or main. The user's package will contain images for data releases that have been specified in package.json in branches pushed to their fork.

If a dev needs to create an image for a data release that does not yet exist in the Sage-Bionetworks package, they should follow these steps:

  1. Create a new branch.
  2. Update the package.json to reflect the appropriate data-file and data-version values.
  3. If necessary, update ./scripts/collections.csv to specify new collections and ./scripts/mongo-create-Indexes.js to specify new indexes.
  4. Commit the changes.
  5. Push the changes to your remote fork to trigger a run of the ci.yml workflow.
  6. The new image will be available in your user namespaced GHCR package, e.g. https://github.com/hallieswan/Agora/pkgs/container/agora-data-nonmonorepo.
  7. Update your local .env file so DATA_IMAGE_PATH points to the newly created image, e.g. ghcr.io/hallieswan/agora-data-nonmonorepo:syn13363290.68.
  8. Start the containerized database: npm run docker:db:start.
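
The example in step 7 suggests that the image tag is the data-file value followed by a dot and the data-version value. That pattern is inferred from the single example above and is an assumption, as is the package name being the same in every namespace. Under those assumptions, a hypothetical helper for composing the reference could look like this:

```shell
# Compose a GHCR image reference from a namespace, data-file, and data-version.
# Assumes the tag format <data-file>.<data-version> seen in the example above.
data_image_path() {
  # Image repository names are lowercase, so normalize the namespace.
  owner="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  data_file="$2"
  data_version="$3"
  printf 'ghcr.io/%s/agora-data-nonmonorepo:%s.%s\n' "$owner" "$data_file" "$data_version"
}

# Example:
# data_image_path hallieswan syn13363290 68
```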

Development

# Build the server and app and watch for changes
npm run dev

Go to http://localhost:8080

Testing

# Run unit tests
npm run test

# Run unit tests and watch for changes
npm run test:watch

# Run end-to-end tests (requires build)
npm run e2e

Deployment

Commit changes

Before pushing code to the dev branch, we should follow these steps to make sure everything is running without errors.

# Clean everything
npm run clean

# Re-install dependencies
npm install

# Run unit tests
npm run test

# Build app and server
npm run build

# Run end-to-end tests
npm run e2e

# Go to localhost:8080 and verify the app is running without errors
npm run start

Continuous Deployment

We have set up Travis to deploy Agora to our AWS infrastructure. We continuously deploy to three environments: development, staging, and production.

Deployment Workflow

To deploy Agora updates to one of the environments, merge code to the branch you would like to deploy to. Travis will then take care of building, testing, and deploying the Agora application.

Deployment configurations

Elastic Beanstalk uses files in the .ebextensions folder to configure the environment that the Agora application runs in. The .ebextensions files are packaged up with Agora and deployed to Beanstalk by the CI system.

Deployment Builds

Deployment for New Data (Updated 1/31/23)

  1. Ensure the new data files are available in the Synapse Agora Live Data folder.
  2. Determine the version number of the data_manifest.csv file to use for the data release:
    1. The manifest must specify the appropriate version of each json file for the data release
    2. If a suitable data_manifest.csv does not exist, you can manually generate one and upload it to Synapse
  3. Update data version in data-manifest.json in Agora Data Manager. (example):
    1. The version should match the version of the desired data_manifest.csv file in Synapse. Note that the manifest references itself internally, but the version in that internal reference is always off by one. Use the manifest version surfaced via the Synapse UI, not the one in the manifest.
  4. If there is a new json file (i.e. not updating existing data):
    1. add an entry for the new file to agora-data-manager's import-data.sh script. (example)
    2. add an entry for the new collection to agora-data-manager's create-indexes.sh script (example)
    3. add an entry for the new file to ./scripts/mongo-import.sh in Agora (this repository)
    4. add an entry for the new collection to ./scripts/mongo-create-indexes.js (this repository)
  5. Merge your changes to Agora Data Manager to the develop branch.
  6. Verify new data is in the database in the develop environment; see Agora environments for information about connecting to our AWS DocumentDB instances
  7. Update data-version in package.json in Agora (this repository). (example) The version should match the data_manifest.csv file in Synapse. Then merge the change to Agora's develop branch.
  8. Check new data shows up on Agora's dev branch.
  9. Check new data version shows up in the footer on Agora's dev branch.
  10. Once verified in the develop environment, you can promote the data release to staging by:
    1. Merging the Agora Data Manager develop branch to the staging branch
    2. Verifying the new data is in the staging environment's database
    3. Merging the Agora develop branch to staging
    4. Verifying the new data and data version in the staging environment
  11. To promote to production, repeat step 10 but merge the staging branches to the prod branches

New Data Testing

  1. New or updated data is usually in json format. Make sure you have MongoDB installed locally as described in "Running the app" section. Then you can import your json file to your local MongoDB database by running:
mongoimport --db agora --collection [add collection name here] --jsonArray --drop --file [path to the json file name]

More examples can be found here.

  2. Verify that the data was successfully imported into the database. You can do that by using a GUI for MongoDB. The connection address for MongoDB on your local machine is localhost and the port number is 27017.
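
When several json files need testing at once, the mongoimport command from step 1 can be wrapped in a loop. The sketch below assumes each collection is named after its file (for example, a hypothetical genes.json loading into a collection called genes), which may not hold for every Agora file. collection_name and import_dir are hypothetical helpers, not part of this repository.

```shell
# Derive a collection name from a file path, e.g. /path/to/genes.json -> genes.
collection_name() {
  basename "$1" .json
}

# Import every .json file in a directory into the local agora database,
# using the same mongoimport flags as the command in step 1.
import_dir() {
  dir="$1"
  for file in "$dir"/*.json; do
    [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
    mongoimport --db agora --collection "$(collection_name "$file")" \
      --jsonArray --drop --file "$file" || return 1
  done
}

# Example, with a hypothetical directory of test files:
# import_dir ./my-test-data
```

Because --drop replaces each collection before importing, this is best kept to a local database.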

Style Guide and Project Structure

This project follows the official Angular style guide. Key points from the guide to keep in mind:

  • Define components or services that do one thing only, one per file. Try to use small functions where possible, making them reusable.

  • Keep the consistency in file and folder names. Use dashes to separate words in the descriptive prefix name and dots to separate the suffix words. Use the type and extension names in the file name, e.g. a.component.ts, a.service.ts or a.module.ts. The style guide has references about naming the other types of files in an Angular project.

  • Use camel case for variable names, even for constants, as they are easy to read. If the values don't change, use a const declaration. For interfaces, use upper camel case, e.g. MyInterface.

  • The guide advises separating application imports from third-party imports. This project goes one step further and also separates imports by source and purpose, grouping Angular framework, project components and services, and third-party TypeScript/JavaScript libraries separately.

  • The folder structure is not restricted by the style guide, but it should be organized so that the project is easy to maintain and expand, and so that files can be identified at a glance. This project uses a root folder called src and one main folder for each module. When a specific folder reaches seven or more files, it is split into sub-folders. Another reason to split is to keep a smart view component together with its dumb child components.

  • For the file structure this project uses the component approach. This is the new standard for developing Angular apps and a great way to ensure maintainable code by encapsulation of our behavior logic. A component is basically a self contained app usually in a single file or a folder with each concern as a file: style, template, specs, e2e, and component class.

External Stylesheets

Any stylesheets (Sass or CSS) placed in the src/styles directory and imported into your project will automatically be compiled into an external .css and embedded in your production builds.

For example to use Bootstrap as an external stylesheet:

  1. Create a styles.scss file (name doesn't matter) in the src/styles directory.
  2. npm install the version of Bootstrap you want.
  3. In styles.scss add @import 'bootstrap/scss/bootstrap.scss';
  4. In src/app/core/core.module.ts add underneath the other import statements: import '../styles/styles.scss';

Since we are using PrimeNG, style rules might not be applied to nested Angular child components. There are two ways to work around Angular's style scoping:

  • Special Selectors

You can keep the Shadow DOM (emulated browser encapsulation) and still apply rules from third party libraries to nested children with this approach. This is the recommended way, but it is harder to implement in certain scenarios.

:host /deep/ .ui-paginator-bottom {
    display: none;
}
  • Disable View Encapsulation

This is the easiest way to apply nested style rules: go to the component and turn off the encapsulation. This way the rules are passed from parent to children without problems, but any rule created in one component affects the other components. This project uses this approach, so take care to name style classes after the current component only.

...
import { ..., ViewEncapsulation } from '@angular/core';

@Component({
  ...
  encapsulation: ViewEncapsulation.None,
})

AoT Don'ts

The following are some things that will make AoT compile fail.

  • Don’t use require statements for your templates or styles; use styleUrls and templateUrl. The angular2-template-loader plugin will change them to require at build time.
  • Don’t use default exports.
  • Don’t use form.controls.controlName, use form.get('controlName')
  • Don’t use control.errors?.someError, use control.hasError('someError')
  • Don’t use functions in your providers, routes or declarations, export a function and then reference that function name
  • @Inputs, @Outputs, View or Content Child(ren), Hostbindings, and any field you use from the template or annotate for Angular should be public

Configuration

Configuration files live in config/. We currently use webpack and karma for different stages of the application.

License

MIT