merge data workflow to main #48
Merged
10 commits
38a5cf1 (Fazziekey): update README (#40)
59d9dab (Fazziekey): support data aligment workflow
79cc841 (Fazziekey): Merge pull request #44 from Fazziekey/data_workflow
5241935 (Fazziekey): add inference script (#47)
87dc92f (Fazziekey): fix workflow (#49)
ff17065 (Gy-Lu): modify reading data way and add inference test
48736a8 (Gy-Lu): sync develop with main
49c69bf (Gy-Lu): fix format
7193e8e (Gy-Lu): fix some format, update readme
4bf5dcf (Gy-Lu): update docker and readme
@@ -0,0 +1 @@
from .workflow_run import batch_run
@@ -0,0 +1,5 @@
from .task_factory import TaskFactory
from .hhblits import HHBlitsFactory
from .hhsearch import HHSearchFactory
from .jackhmmer import JackHmmerFactory
from .hhfilter import HHfilterFactory
@@ -0,0 +1,29 @@
from ray import workflow
from typing import List
from fastfold.workflow.factory import TaskFactory
from ray.workflow.common import Workflow
import fastfold.data.tools.hhblits as ffHHBlits

class HHBlitsFactory(TaskFactory):

    keywords = ['binary_path', 'databases', 'n_cpu']

    def gen_task(self, fasta_path: str, output_path: str, after: List[Workflow]=None) -> Workflow:

        self.isReady()

        # setup runner
        runner = ffHHBlits.HHBlits(
            binary_path=self.config['binary_path'],
            databases=self.config['databases'],
            n_cpu=self.config['n_cpu']
        )

        # generate step function
        @workflow.step
        def hhblits_step(fasta_path: str, output_path: str, after: List[Workflow]) -> None:
            result = runner.query(fasta_path)
            with open(output_path, "w") as f:
                f.write(result["a3m"])

        return hhblits_step.step(fasta_path, output_path, after)
@@ -0,0 +1,33 @@
import subprocess
import logging
from ray import workflow
from typing import List
from fastfold.workflow.factory import TaskFactory
from ray.workflow.common import Workflow

class HHfilterFactory(TaskFactory):

    keywords = ['binary_path']

    def gen_task(self, fasta_path: str, output_path: str, after: List[Workflow]=None) -> Workflow:

        self.isReady()

        # generate step function
        @workflow.step
        def hhfilter_step(fasta_path: str, output_path: str, after: List[Workflow]) -> None:

            cmd = [
                self.config.get('binary_path'),
            ]
            if 'id' in self.config:
                cmd += ['-id', str(self.config.get('id'))]
            if 'cov' in self.config:
                cmd += ['-cov', str(self.config.get('cov'))]
            cmd += ['-i', fasta_path, '-o', output_path]

            logging.info(f"HHfilter start: {' '.join(cmd)}")

            subprocess.run(cmd)

        return hhfilter_step.step(fasta_path, output_path, after)
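The command assembly in `hhfilter_step` can be exercised without Ray or the HH-suite binary. The following is a minimal sketch rather than the PR's code: `build_hhfilter_cmd` is a hypothetical helper name, and the paths in the usage line are placeholders. It mirrors the logic in the step, adding `-id` and `-cov` only when those optional keys are present in the config.

```python
from typing import Any, Dict, List


def build_hhfilter_cmd(config: Dict[str, Any], fasta_path: str, output_path: str) -> List[str]:
    """Assemble the hhfilter argument list the same way hhfilter_step does.

    Hypothetical helper for illustration; it does not run the binary.
    """
    cmd = [config["binary_path"]]
    # Optional filters are passed through only when configured.
    if "id" in config:
        cmd += ["-id", str(config["id"])]
    if "cov" in config:
        cmd += ["-cov", str(config["cov"])]
    cmd += ["-i", fasta_path, "-o", output_path]
    return cmd


if __name__ == "__main__":
    # Placeholder binary name and paths, not taken from the PR.
    print(build_hhfilter_cmd({"binary_path": "hhfilter", "id": 90}, "in.a3m", "out.a3m"))
```

Keeping the argument list in a pure function like this would make it possible to unit-test the flag handling separately from the `subprocess.run` call.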
@@ -0,0 +1,38 @@
from fastfold.workflow.factory import TaskFactory
from ray import workflow
from ray.workflow.common import Workflow
import fastfold.data.tools.hhsearch as ffHHSearch
from typing import List

class HHSearchFactory(TaskFactory):

    keywords = ['binary_path', 'databases', 'n_cpu']

    def gen_task(self, a3m_path: str, output_path: str, after: List[Workflow]=None) -> Workflow:

        self.isReady()

        # setup runner
        runner = ffHHSearch.HHSearch(
            binary_path=self.config['binary_path'],
            databases=self.config['databases'],
            n_cpu=self.config['n_cpu']
        )

        # generate step function
        @workflow.step
        def hhsearch_step(a3m_path: str, output_path: str, after: List[Workflow], atab_path: str = None) -> None:

            with open(a3m_path, "r") as f:
                a3m = f.read()
            if atab_path:
                hhsearch_result, atab = runner.query(a3m, gen_atab=True)
            else:
                hhsearch_result = runner.query(a3m)
            with open(output_path, "w") as f:
                f.write(hhsearch_result)
            if atab_path:
                with open(atab_path, "w") as f:
                    f.write(atab)

        return hhsearch_step.step(a3m_path, output_path, after)
@@ -0,0 +1,34 @@
from fastfold.workflow.factory import TaskFactory
from ray import workflow
from ray.workflow.common import Workflow
import fastfold.data.tools.jackhmmer as ffJackHmmer
from fastfold.data import parsers
from typing import List

class JackHmmerFactory(TaskFactory):

    keywords = ['binary_path', 'database_path', 'n_cpu', 'uniref_max_hits']

    def gen_task(self, fasta_path: str, output_path: str, after: List[Workflow]=None) -> Workflow:

        self.isReady()

        # setup runner
        runner = ffJackHmmer.Jackhmmer(
            binary_path=self.config['binary_path'],
            database_path=self.config['database_path'],
            n_cpu=self.config['n_cpu']
        )

        # generate step function
        @workflow.step
        def jackhmmer_step(fasta_path: str, output_path: str, after: List[Workflow]) -> None:
            result = runner.query(fasta_path)[0]
            uniref90_msa_a3m = parsers.convert_stockholm_to_a3m(
                result['sto'],
                max_sequences=self.config['uniref_max_hits']
            )
            with open(output_path, "w") as f:
                f.write(uniref90_msa_a3m)

        return jackhmmer_step.step(fasta_path, output_path, after)
@@ -0,0 +1,50 @@
import json
from ray.workflow.common import Workflow
from os import path
from typing import Any, List

class TaskFactory:

    keywords = []

    def __init__(self, config: dict = None, config_path: str = None) -> None:

        # skip if no keyword required from config file
        if not self.__class__.keywords:
            return

        # setting config for factory
        if config is not None:
            self.config = config
        elif config_path is not None:
            self.loadConfig(config_path)
        else:
            self.loadConfig()

    # Not named `configure`: Python keeps only the last method defined under a given name.
    def update_config(self, config: dict, purge=False) -> None:
        if purge:
            self.config = config
        else:
            self.config.update(config)

    def configure(self, keyword: str, value: Any) -> None:
        self.config[keyword] = value

    def gen_task(self, after: List[Workflow]=None, *args, **kwargs) -> Workflow:
        raise NotImplementedError

    def isReady(self):
        for key in self.__class__.keywords:
            if key not in self.config:
                raise KeyError(f"{self.__class__.__name__} not ready: \"{key}\" not specified")

    def loadConfig(self, config_path='./config.json'):
        with open(config_path) as configFile:
            globalConfig = json.load(configFile)
        if 'tools' not in globalConfig:
            raise KeyError("\"tools\" not found in global config file")
        factoryName = self.__class__.__name__[:-7]
        if factoryName not in globalConfig['tools']:
            raise KeyError(f"\"{factoryName}\" not found in the \"tools\" section in config")
        self.config = globalConfig['tools'][factoryName]
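How a factory finds its settings can be shown in isolation. The sketch below assumes a config layout inferred from `loadConfig` and the `keywords` lists in this PR: a top-level `"tools"` object keyed by the class name with the trailing `Factory` removed. All paths and values are placeholders, and `tool_section` and `missing_keywords` are hypothetical helper names, not part of FastFold.

```python
import json
from typing import Dict, List

# Placeholder config; real paths and values are not given in the PR.
EXAMPLE_CONFIG = json.loads("""
{
  "tools": {
    "HHBlits": {"binary_path": "/path/to/hhblits", "databases": ["/path/to/db"], "n_cpu": 4},
    "HHfilter": {"binary_path": "/path/to/hhfilter"}
  }
}
""")


def tool_section(factory_class_name: str, global_config: Dict) -> Dict:
    """Look up a factory's section following the same steps as loadConfig."""
    if "tools" not in global_config:
        raise KeyError('"tools" not found in global config file')
    # "HHBlitsFactory"[:-7] == "HHBlits": the slice drops the 7-character suffix "Factory".
    name = factory_class_name[:-7]
    if name not in global_config["tools"]:
        raise KeyError(f'"{name}" not found in the "tools" section in config')
    return global_config["tools"][name]


def missing_keywords(keywords: List[str], config: Dict) -> List[str]:
    """Return required keys absent from config (isReady raises on the first such key)."""
    return [key for key in keywords if key not in config]


if __name__ == "__main__":
    section = tool_section("HHBlitsFactory", EXAMPLE_CONFIG)
    print(missing_keywords(["binary_path", "databases", "n_cpu"], section))
```

Because the section name comes from slicing the class name, a subclass whose name does not end in `Factory` would look up a truncated key, so the naming convention is effectively part of the config contract.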
@@ -0,0 +1 @@
from .fastfold_data_workflow import FastFoldDataWorkFlow
Will this `logging` conflict with `print`?
no
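The short answer can be checked directly. By default `print` writes to standard output, while the handler that `logging.basicConfig` installs writes to standard error, so the two do not interfere. One caveat: the root logger's default level is WARNING, so a `logging.info` call such as the `HHfilter start` message is dropped unless the level is lowered. The sketch below is illustrative and not part of the PR; the `demo` logger name is arbitrary, and an in-memory buffer stands in for standard error.

```python
import io
import logging
from contextlib import redirect_stdout


def run_demo() -> tuple:
    """Emit one print and one log record, returning (stdout_text, log_text)."""
    log_buffer = io.StringIO()
    handler = logging.StreamHandler(log_buffer)  # stands in for stderr
    logger = logging.getLogger("demo")
    logger.setLevel(logging.INFO)  # INFO records are dropped at the default WARNING level
    logger.addHandler(handler)
    logger.propagate = False  # keep the record out of any root handlers

    out_buffer = io.StringIO()
    with redirect_stdout(out_buffer):
        print("result line")
        logger.info("HHfilter start: ...")

    logger.removeHandler(handler)
    return out_buffer.getvalue(), log_buffer.getvalue()


if __name__ == "__main__":
    stdout_text, log_text = run_demo()
    print(repr(stdout_text), repr(log_text))
```

Each message ends up only in its own stream, which is why mixing the two calls in one step is safe.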