-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added new subcommand to merge TFBS in a subset of regions #268
base: dev
Are you sure you want to change the base?
Conversation
required.add_argument("--bindetect", "--TFBS", "--input", type=os.path.abspath, dest="tfbs", help="Path to the output directory of BINDetect containing all TFBS files.") | ||
required.add_argument("--regions", help="Path to the query regions bed file.", type=os.path.abspath, dest="regions") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
required.add_argument("--bindetect", "--TFBS", "--input", type=os.path.abspath, dest="tfbs", help="Path to the output directory of BINDetect containing all TFBS files.") | |
required.add_argument("--regions", help="Path to the query regions bed file.", type=os.path.abspath, dest="regions") | |
required.add_argument("--bindetect", "--TFBS", "--input", required=True, type=os.path.abspath, dest="tfbs", help="Path to the output directory of BINDetect containing all TFBS files.") | |
required.add_argument("--regions", required=True, help="Path to the query regions bed file.", type=os.path.abspath, dest="regions") |
You have to add required=True
otherwise parameters are considered optional.
|
||
#Required arguments | ||
required = parser.add_argument_group('Required arguments') | ||
required.add_argument("--bindetect", "--TFBS", "--input", type=os.path.abspath, dest="tfbs", help="Path to the output directory of BINDetect containing all TFBS files.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you add three different names for this parameter? This could be confusing for some. Consider choosing one and an abbreviation e.g. input
and i
|
||
#Optional arguments | ||
optional = parser.add_argument_group('Optional arguments') | ||
optional.add_argument( "--output", default='./merged_TFBS_subset.xlsx', help="Path for output file. If file name ends with .bed, no header column will be added.", type=os.path.abspath, dest="output") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Use a plain txt file as default output for example tsv or bed.
- Also, add the default value to the help message. See the parameters of other functions, match their style and check if other parameters should have a default in the help description.
optional.add_argument("--TFs", help="Path to the file containing the list of TFs to subset.", type=os.path.abspath, dest="TF", default=None) | ||
optional = add_logger_args(optional) | ||
|
||
return(parser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return(parser) | |
return(parser) | |
Git wants to have an empty line at the end of each file.
parser.print_help() | ||
sys.exit() | ||
|
||
run_submerge(args.tfbs, args.regions, args.TF, args.output, args.verbosity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not?
run_submerge(args.tfbs, args.regions, args.TF, args.output, args.verbosity) | |
run_submerge(args) |
command = f"bedtools intersect -a {regions} -b {headless_file} -wa -wb" | ||
intersection = subprocess.check_output(command, shell=True).decode("utf-8") | ||
all_intersections += intersection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use pybedtools
if args.output.endswith(".bed"): | ||
pass | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if args.output.endswith(".bed"): | |
pass | |
else: | |
if not args.output.endswith(".bed"): |
logger.end() | ||
|
||
|
||
def main(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main function is not really needed. Since it is not much you could just add the code to the "main-if". However, this comes down to preference consider this a comment and keep it as you like.
# remove 'chr' str from all chr columns TODO not optimal as contigs may be named differently, i.e. 'contig' or 'chrIV' | ||
df["query chr"] = df["query chr"].str.replace("chr", "") | ||
df["TFBS_chr"] = df["TFBS_chr"].str.replace("chr", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do this for sorting purposes alone consider using the key
parameter instead.
df.to_csv(args.output, sep="\t", index=False) | ||
|
||
if args.output.endswith(".xlsx"): | ||
df = pd.read_csv(args.output, sep="\t") | ||
df.to_excel(args.output, index=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are writing the output file then you read it back in and then you overwrite it as an excel file? What? Please explain :D
No description provided.