QIAcuity Concatenator

CAVEAT EMPTOR

I am not a geneticist, but my sister is. From time to time, I help her with data processing by writing small tools that reduce the drudgery in her work. This repo is one such example.
I make no claims or warranties about the accuracy of this work or the choice of formulas. I am open sourcing it in case it is useful for others.
I have only tested this on a Mac with pyenv and Anaconda and with files from my sister.
I am not a UI person. ;-)
There is very little error handling and no unit tests.

Intro

A simple user interface and processing layer to concatenate together large numbers of files that come from the QIAGEN QIAcuity instrument and do some basic calculations on them.

It is primarily designed to be run locally and interacted with via your browser.

Features

Adds Plate ID to both analysis and occupancy files based off the file name pattern described in the Using section
Concatenates together any number of analysis files generated by the QIAGEN QIAcuity instrument
Concatenates together any number of multiple occupancy files generated by the QIAGEN QIAcuity instrument
Fixes the empty "Well" column header in the Analysis files
Updates, adds, or modifies the following fields and includes them in the output (see "Explanation of Column Names" below for more details):
- Analysis
  - Replaces "-" in the CI (95%) column with NAN so that computations don't fail
  - Turns CI (95%) into a float instead of a "string percentage"
  - Adds columns for the user supplied Upstream DF (dilution factor), uL into Reaction, Reaction Volume fields
  - Calculates:
    - Concentration in Sample tube (cp/uL)
    - 95% CI (cp/uL)
    - Valid Partitions (%)
    - E
    - Lambda
    - 0.00 (don't ask, I don't fully understand this)
    - 1.00 (don't ask, I don't fully understand this)
    - +1 (don't ask, I don't fully understand this)
    - Partitions with 0 molecules
    - Partitions with 1 molecules
    - Partitions with >1 molecules
- Occupancy (Multiple Occupancy?)
  - Replaces Category values (e.g. YELLOW-CRIMSON-RED) with user-provided assay names
  - Joins with a paired analysis file to add the Sample/NTC/Control column and values. This is a "left" join using the "Plate ID" and "Well" as a primary key.
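Two of the steps listed above, the CI (95%) cleanup and the left join, can be sketched in Pandas roughly as follows. This is a minimal illustration, not the code from concatenate.py: the function names clean_ci_column and add_sample_names are hypothetical, and the sketch assumes CI values look like "12.5%" and are kept as percentage numbers (12.5) rather than fractions.

```python
import numpy as np
import pandas as pd


def clean_ci_column(df: pd.DataFrame, column: str = "CI (95%)") -> pd.DataFrame:
    """Return a copy where "12.5%" becomes 12.5 and "-" becomes NaN."""
    out = df.copy()
    # Assumption: values are strings such as "12.5%" or a bare "-".
    text = out[column].astype(str).str.strip().str.rstrip("%")
    text = text.mask(text == "-")  # "-" becomes NaN so later math does not fail
    out[column] = pd.to_numeric(text, errors="coerce")
    return out


def add_sample_names(occupancy: pd.DataFrame, analysis: pd.DataFrame) -> pd.DataFrame:
    """Left-join Sample/NTC/Control onto occupancy rows by Plate ID and Well."""
    keys = ["Plate ID", "Well"]
    # Analysis files have one row per target, so keep a single row per well
    # to avoid multiplying occupancy rows during the join.
    lookup = analysis[keys + ["Sample/NTC/Control"]].drop_duplicates(subset=keys)
    return occupancy.merge(lookup, on=keys, how="left")
```

Because the join is a "left" join, occupancy rows without a matching analysis row are kept and simply get an empty Sample/NTC/Control value.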

Prerequisites

Python 3.x (I've only tested on 3.9.7)
Familiarity with using the command line.

Some way of managing your Python virtual environments or otherwise installing Python dependencies (for example, pyenv, described below).

Installing and Running

Pyenv (Optional)

pyenv install 3.9.7
pyenv virtualenv 3.9.7 qiacuity

From the Command Line

One Time Setup

1. Clone this repository and change to the directory:
  1. git clone git@github.com:gsingers/qiacuity-concatenator.git
  2. cd qiacuity-concatenator
2. If you are using a Python Virtual Environment, be sure to activate it first (I recommend pyenv, see above): pyenv activate qiacuity
  1. If you don't know what that is, don't worry about it, it should "just work".
3. pip install -r requirements.txt

Running

1. If you are using pyenv, activate the environment: pyenv activate qiacuity
2. If you want to use the default data locations (in the data directory under the current directory): ./run.sh
3. If you have your data somewhere else, you can pass in the folder locations like: ./run.sh -u /path/to/upload/folder -c /path/to/completed/folder -r /path/to/results/folder
  1. For example: ./run.sh -u /tmp/data/uploads -r /tmp/data/results -c /tmp/data/completed

In your browser

http://localhost:5000

Using

The basic workflow is:

Provide input data, using one of two ways:
1. Upload the files you want to merge. Two CSV header types are supported:
  1. Analysis files. The header must be: "","Sample/NTC/Control","Reaction Mix","Target","IC","Control type","Concentration (copies/muL)","CI (95%)","Partitions (valid)","Partitions (positive)","Partitions (negative)","Threshold"
    1. Note: the empty first column name is the Well. The program will try to auto-detect and replace that.
  2. Occupancy files. The header must be: "Well","Hyperwell","Categories","Group","Count","Total","Volume"
2. Since this is running locally on your machine, you can also bulk copy data into the ./data/uploads directory, as in cp /path/to/csv_files /path/to/this/project/data/uploads or using whatever file viewer tool you want (e.g. Finder on the Mac)
3. IMPORTANT: All files must be of the format:
  1. Analysis files: <PLATE_ID>-[USER DEFINED]-analysis.csv, e.g. D123433F-My-Customer-analysis.csv
  2. Occupancy Files: <PLATE_ID>-[USER DEFINED].csv e.g. D123433F-My-Customer.csv
  3. IMPORTANT: For occupancy files, it is assumed there is a matching "Analysis" file which we can join on to extract the Sample/NTC/Control (see below) value. If the program can't find the matching file, it will return a 400 error code.
  4. IMPORTANT: Files must be encoded either as Windows-1252 or UTF-8.
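The naming convention above can be checked before uploading with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the rule used to find the matching analysis file (same name with "-analysis" added before the extension) is an assumption based on the examples above, not something confirmed from concatenate.py.

```python
from pathlib import Path

ANALYSIS_SUFFIX = "-analysis.csv"


def plate_id_from_filename(filename: str) -> str:
    """Return everything before the first "-" in the file name."""
    name = Path(filename).name
    plate_id, separator, _rest = name.partition("-")
    if not separator or not plate_id:
        raise ValueError(f"{name!r} does not match <PLATE_ID>-[USER DEFINED].csv")
    return plate_id


def is_analysis_file(filename: str) -> bool:
    """Analysis files end in -analysis.csv; anything else is treated as occupancy."""
    return Path(filename).name.lower().endswith(ANALYSIS_SUFFIX)


def matching_analysis_name(occupancy_filename: str) -> str:
    """Assumed pairing rule: X.csv is paired with X-analysis.csv."""
    path = Path(occupancy_filename)
    return f"{path.stem}-analysis{path.suffix}"


if __name__ == "__main__":
    print(plate_id_from_filename("D123433F-My-Customer-analysis.csv"))
    print(matching_analysis_name("D123433F-My-Customer.csv"))
```

Running a check like this over the uploads folder before starting the app is one way to catch misnamed files ahead of the 400 error.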
Click the Start Concatenator link (e.g. http://localhost:5000/concatenate/select_files)
Fill in the form values, select the files you want to process, and hit submit.
Your results will be in data/results. You can download them from the app or access them via your file viewer or the command line.
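If you are unsure whether a file will be accepted, the two header types and the two encodings described in the upload step can be checked with a short script. The following is a hedged sketch using only the standard library. The function names are hypothetical, and the analysis check deliberately looks only at the first two columns (the empty or "Well" column followed by "Sample/NTC/Control") rather than the full header.

```python
import csv
from pathlib import Path

OCCUPANCY_HEADER = ["Well", "Hyperwell", "Categories", "Group", "Count", "Total", "Volume"]


def read_header(path: str) -> list:
    """Read the first CSV row, trying UTF-8 first and then Windows-1252."""
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8-sig")  # also strips a UTF-8 byte order mark
    except UnicodeDecodeError:
        text = raw.decode("cp1252")
    lines = text.splitlines()
    return next(csv.reader(lines[:1]), [])


def classify_header(header: list) -> str:
    """Return "analysis", "occupancy", or "unknown" for a header row."""
    if header == OCCUPANCY_HEADER:
        return "occupancy"
    # Analysis files start with an empty (or already fixed "Well") column name.
    if len(header) >= 2 and header[0] in ("", "Well") and header[1] == "Sample/NTC/Control":
        return "analysis"
    return "unknown"
```

A file that comes back as "unknown" is worth opening in a text editor before uploading it.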

Changing formulas

The main work of merging and calculating values is done in concatenate.py, in the process_analysis_file and process_occupancy_file methods, where we add things like the Plate ID and Sample/NTC/Control, rename some missing columns, and calculate some statistics. All of this work is done in Pandas in case you want to change what is calculated.
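As an illustration of the kind of Pandas change you might make, here is a sketch of a calculated column that back-calculates the sample tube concentration from the user-supplied values. The formula is an assumption (reaction concentration, times the dilution into the reaction, times the upstream dilution factor) and may not match what concatenate.py does. The function name and the default source column name are likewise hypothetical.

```python
import pandas as pd


def add_sample_tube_concentration(
    df: pd.DataFrame,
    upstream_df: float,
    ul_into_reaction: float,
    reaction_volume: float,
    source_column: str = "Concentration (copies/µL)",
) -> pd.DataFrame:
    """Return a copy of df with the user inputs and a back-calculated column."""
    out = df.copy()
    out["Upstream DF"] = upstream_df
    out["uL into Reaction"] = ul_into_reaction
    out["Reaction Volume"] = reaction_volume
    # Assumed formula: undo the dilution into the reaction mix, then the
    # upstream dilution that happened before the sample met the mastermix.
    reaction_dilution = reaction_volume / ul_into_reaction
    out["Concentration in Sample tube (cp/uL)"] = (
        out[source_column] * reaction_dilution * upstream_df
    )
    return out
```

For example, 10 copies/µL in the reaction, with 2 uL of template in a 40 uL reaction and an upstream dilution factor of 5, gives 10 × 20 × 5 = 1000 cp/uL under this assumed formula.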

Cleaning out old files

This program does very little file management. We move processed files under the "COMPLETED_FOLDER", but that's about it.
In order to declutter the file listings, you should periodically move the files out of the data directory.

Explanation of Column Names

Explanation of columns in the Analysis Concatenated Data file:

Well: 24 well plates are A01-H03; 96 well plates are A01-H12. Samples are loaded in column order.
Sample/NTC/Control: The name of the Sample, NTC, or Control in the well.
Reaction Mix: This identifies the combination of assays used for the sample in question. See Assays used section for details.
Target: The name of the assay being reported. If there are multiplexed assays, the results of each assay will be reported on a separate line.
IC: This column is not used
Type: Identifies Samples vs. controls (This was not annotated in this file)
Concentration in Reaction Mix (cp/ul): This is the number of copies per microliter in the mix as it was loaded onto the instrument. It is calculated based on the number of valid partitions. This value has been corrected by Poisson, but has not had the original dilution factor added to it.
95% CI (%): Confidence interval for the concentration.
Partitions (Valid): The number of partitions that were filled with master mix
Partitions (Positive): The number of partitions that have signal*
Partitions (Negative): The number of partitions that do not have signal*
Threshold: The rfu value setting which distinguishes positive from negative partitions.
Plate ID: Our internal reference name for the run plate
Upstream DF: Dilution factor prior to mixing sample with mastermix
uL into Reaction: The amount of (diluted) template added to the mastermix
Reaction Volume: The volume of reaction mix added to the plate
Concentration in Sample Tube: The concentration in the original sample, prior to dilution. This is the number that should be used for evaluating the data.
95% CI (cp/ul): Confidence interval for the original sample concentration
Valid Partitions (%): The percent of the total possible partitions that were filled
E: The ratio of Positive Partitions/Valid Partitions
Lambda: The -LN of E. Results with a lambda less than 0.01 are likely to contain only one molecule in each partition. This is important when looking at multiple occupancy. These results are highlighted in green.
Fraction with (0, 1, >1) molecule(s): These columns are the fraction of partitions with 0, 1, or >1 molecules/partition. (Calculation:
Expected partitions with 0 molecules: The number of partitions predicted to be negative
Expected partitions with 1 molecule: The number of partitions predicted to have one molecule
Expected partitions with >1 molecule: The number of partitions predicted to have more than one molecule.

Note: These numbers are the raw counts and have not been adjusted by Poisson. The adjusted numbers (not provided) are the numbers used to calculate the concentrations. These values should not be used, as they will be statistically incorrect.
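For readers who want to see how occupancy fractions of this kind are usually derived, here is a sketch of the textbook Poisson model. It is not taken from concatenate.py and may differ from what the tool computes: it assumes lambda is -ln(negative partitions / valid partitions), whereas the column description above words it as the -LN of E. The function name is hypothetical.

```python
import math


def poisson_occupancy(valid: int, positive: int) -> dict:
    """Estimate lambda and the expected occupancy of partitions."""
    if valid <= 0 or not 0 <= positive < valid:
        raise ValueError("need 0 <= positive < valid and valid > 0")
    negative_fraction = (valid - positive) / valid
    lam = -math.log(negative_fraction)  # mean molecules per partition
    p0 = math.exp(-lam)                 # fraction with 0 molecules
    p1 = lam * math.exp(-lam)           # fraction with exactly 1 molecule
    p_many = 1.0 - p0 - p1              # fraction with more than 1 molecule
    return {
        "lambda": lam,
        "fraction_0": p0,
        "fraction_1": p1,
        "fraction_gt1": p_many,
        "expected_0": p0 * valid,
        "expected_1": p1 * valid,
        "expected_gt1": p_many * valid,
    }
```

Under this model, a small lambda means almost every positive partition holds a single molecule, which is why low-lambda results matter when looking at multiple occupancy.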

Explanation of columns in the Occupancy Concatenated Data file:

Well: 24 well plates are A01-H03; 96 well plates are A01-H12. Samples are loaded in column order.
Hyperwell: An indicator if the data is from combining multiple wells.
Sample Name: Your sample name
Categories: Description of the order of assays for the “group” column
Group: An indication of which assays are giving signal for the given row. (i.e. ++++ indicates all four assays are giving signal; +- -+ means the signal being reported is from the first assay and the last assay, etc.)
Count: The number of partitions positive for the group
Total: The total number of valid partitions
Volume: The total volume contained in the valid partitions
Plate ID: Our internal reference name for the run plate
Sample ID: The name you provided for your sample
