NeatSeq-Flow-In-The-Cloud

Use cloud-based HPC to run bioinformatics workflows with NeatSeq-Flow

For more information about "NeatSeq-Flow" see the full documentation on Read The Docs

Note: So far, we have tested NeatSeq-Flow on the Amazon cloud (AWS ParallelCluster) with the SGE HPC scheduler.

To use NeatSeq-Flow on the Amazon cloud with AWS ParallelCluster, follow these steps:

  1. Set up an AWS ParallelCluster and choose the SGE HPC scheduler. Follow the information here or here, changing slurm to sge
  2. SSH to your Master Node
  3. Go (cd) to your shared storage
  4. Type:
      wget https://raw.githubusercontent.com/bioinfo-core-BGU/NeatSeq-Flow-In-The-Cloud/master/Install_script.sh
      sh Install_script.sh setup_conda
      sudo sh Install_script.sh setup_sge
  5. Type:
      dirname $(which qsub)
  6. Type:
      source activate NeatSeq_Flow
      NeatSeq_Flow_GUI.py
  7. In the NeatSeq_Flow GUI, open the Cluster tab within the WorkFlow tab, then set the 'Qsub_q' value to 'all.q' and the 'Qsub_path' value to the directory printed in step 5
  8. Open a new terminal
  9. Before running a workflow, activate the cluster:
      pcluster start mycluster
  10. Wait until at least one working node is active. To view the active nodes, type:
       qhost
     If no active nodes are available, repeat steps 9 and 10
  11. To stop the cluster:
       pcluster stop mycluster
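
The waiting in step 10 can be scripted instead of re-running qhost by hand. The following is a minimal sketch, not part of Install_script.sh: the function names `count_exec_hosts` and `wait_for_nodes` are hypothetical, and it assumes that qhost prints a header line, a separator line and a "global" row before the per-host rows. It only counts the hosts that qhost lists, so a host that is listed but not yet reachable would still be counted.

```shell
#!/bin/sh
# Hypothetical helper: count the execution hosts listed by qhost.
# Assumption: the first two lines of qhost output are a header and a
# separator, and the pseudo-host "global" is not a compute node.
count_exec_hosts() {
    qhost 2>/dev/null | awk 'NR > 2 && $1 != "global" { n++ } END { print n + 0 }'
}

# Hypothetical helper: poll until at least one host is listed.
# $1 = maximum number of attempts (default 30)
# $2 = seconds to sleep between attempts (default 20)
# Returns 0 once a host is listed, 1 if all attempts are used up.
wait_for_nodes() {
    max_tries=${1:-30}
    delay=${2:-20}
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        if [ "$(count_exec_hosts)" -gt 0 ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

After sourcing these functions on the Master Node, something like `pcluster start mycluster && wait_for_nodes 30 20 && echo "nodes are up"` would cover steps 9 and 10 in one line. The attempt count and delay are arbitrary starting values; adjust them to how long your compute nodes take to boot.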