Data Management

Steven Roberts edited this page Mar 16, 2017 · 8 revisions

This page documents all aspects of data management: day-to-day data handling, the formal NGS and proteomics plans, and general archiving options.

Daily Data on Owl

Data, including intermediate analysis, needs to have a URL. This most often means it will live on a Network Attached Storage device (NAS; aka a server).

Using the Owl NAS to store your data:

  1. Ask Steven or Sam to generate a user account for you. A folder will be created for you in owl/web/. Ask Steven or Sam for the name of the folder, as well as your username and password.

  2. Upload data to your Owl web folder:

  3. Navigate to

  4. Click on Web Browser login. If it's your first time visiting this page, your browser will present you with a warning about an insecure site or bad certificate. That's OK. Click on the option to add an exception for this site.

  5. Enter username and password. (NOTE: If it's your first time accessing your account, please change your password by clicking on the silhouette in the upper right corner, then "Personal" in the dropdown menu).

  6. Navigate to File Station > web > your_folder (If you don't see the File Station icon, click on the icon of four squares in the upper left corner and select File Station from the subsequent menu).

  7. Click-and-drag files from your computer to your owl/web folder.

Files that you have uploaded to your_folder are publicly viewable:

You can use the URLs for your files for linking in your notebook.


All folders need to contain a readme file.

The readme files should be plain text (i.e. do not create/edit the file with a word processor like Microsoft Word or LibreOffice Writer) and should describe the contents of the folder. If there are directories in the same folder as your readme file, the directory names should be listed and a brief description of their contents should be provided.
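
The readme requirement can be checked automatically. The following is a minimal sketch, not an established lab tool: the function name `folders_missing_readme` is hypothetical, and it assumes that any file whose name starts with "readme" (in any letter case) counts, since this page does not specify an exact file name.

```python
from pathlib import Path


def folders_missing_readme(root):
    """Return every folder under root (root included) that has no readme file.

    Assumption: a readme is any file whose name starts with "readme",
    case-insensitive (readme, README.txt, readme.md, ...).
    """
    root = Path(root)
    folders = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for folder in folders:
        has_readme = any(
            child.is_file() and child.name.lower().startswith("readme")
            for child in folder.iterdir()
        )
        if not has_readme:
            missing.append(folder)
    return missing
```

Running `folders_missing_readme("path/to/your_folder")` on a local copy of your folder before uploading lists the directories that still need a readme.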

Please refrain from using any non-alphanumeric characters (including spaces) in file and folder names.
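
A naming check along these lines can flag problem names before upload. This is a sketch under a stated assumption: besides letters and digits, it also accepts ".", "_" and "-", because names used on this page (for example checksums.md5 and your_folder) contain them. The names `ALLOWED_NAME`, `is_clean_name` and `find_bad_names` are hypothetical; tighten the pattern if the lab applies the rule more strictly.

```python
import re
from pathlib import Path

# Assumption: letters and digits, plus ".", "_" and "-", are acceptable.
# Spaces and all other characters are rejected.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_clean_name(name):
    """True if the whole name consists only of allowed characters."""
    return ALLOWED_NAME.fullmatch(name) is not None


def find_bad_names(root):
    """Return all files and folders under root whose names break the rule."""
    return sorted(p for p in Path(root).rglob("*") if not is_clean_name(p.name))
```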

NGS Data Management Plan

Raw Data

  1. As the sequencing facility provides data, files are downloaded to our local NAS (owl), in the correct species subdirectory within nightingales.

  2. MD5 checksums are generated and compared to those supplied by the sequencing facility.

  3. Append the generated MD5 checksums to the checksums.md5 file. If that file does not yet exist, create it, and add the generated checksums to the new checksums.md5 file.

  4. The Nightingales Google Spreadsheet is updated.

  5. Each library (i.e. each sample with a unique sequencing barcode) is entered in its own row.

  6. SeqID is the base name of the sequencing file (i.e. no file extensions like ".fq.gz" or ".gz")

  7. Each library receives a unique, incremented Library_ID number.

  8. Each library receives a Library_name; this may or may not be unique.

  9. Update the Nightingales Google Fusion Table with new information from the Nightingales Google Spreadsheet. This is accomplished by:

  10. Deleting all rows in the Nightingales Google Fusion Table (Edit > Delete all rows)

  11. Importing data from the Nightingales Google Spreadsheet (File > Import more rows...)
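
The checksum steps and the SeqID rule above can be sketched in Python. This is an illustration under assumptions, not the lab's actual procedure: it assumes the facility supplies checksums in the common "hash  filename" layout, it treats ".fastq" as a sequencing extension in addition to the ".fq.gz" and ".gz" examples given above, and all function names are hypothetical. As a design choice, it appends to checksums.md5 only when every file matches, on the reasoning that a mismatched file should be downloaded again first.

```python
import hashlib
from pathlib import Path

# Assumption: extensions to strip when deriving a SeqID.
KNOWN_EXTENSIONS = {".gz", ".fq", ".fastq"}


def md5sum(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_file(path):
    """Parse 'hash  filename' lines into a {filename: hash} dictionary."""
    expected = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            checksum, name = line.split(maxsplit=1)
            expected[name.lstrip("*")] = checksum.lower()
    return expected


def verify_and_record(data_dir, facility_md5, record_name="checksums.md5"):
    """Compare generated checksums to the facility's; return mismatched names.

    If everything matches, append the generated checksums to the record
    file (append mode creates the file when it does not yet exist).
    """
    data_dir = Path(data_dir)
    mismatches, lines = [], []
    for name, expected in parse_md5_file(facility_md5).items():
        actual = md5sum(data_dir / name)
        if actual != expected:
            mismatches.append(name)
        lines.append(f"{actual}  {name}\n")
    if not mismatches:
        with open(data_dir / record_name, "a") as record:
            record.writelines(lines)
    return mismatches


def seq_id(filename):
    """Strip known sequencing extensions to get the base name (SeqID)."""
    path = Path(filename)
    while path.suffix in KNOWN_EXTENSIONS:
        path = path.with_suffix("")
    return path.name
```

For example, `seq_id("sample_R1.fq.gz")` returns `"sample_R1"`, and `verify_and_record` returns an empty list when all downloaded files match the facility's checksums.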


  • The Nightingales Google Spreadsheet is backed up on a regular basis? by downloading it as a tab-delimited file and pushing it to the LabDocs Repository, with the file name Nightingales.tsv

  • The nightingales directory on owl is backed up to Amazon Glacier. This is accessible by .......?

SRA Upload ...

Proteomics Data Management Plan

Raw Data

  1. As the sequencing facility provides data, files are downloaded to our local NAS (owl), in the root phainopepla directory.

As opposed to the NGS data, these data will be organized based on date.

  1. The Spreadsheet is updated.