Output file path #10
Thanks for your questions. Currently, the output file path cannot be specified. However, we plan to release PhaGCN2.3 within the next two months. This update will not only refresh the database to align with the latest ICTV tables, but will also add output file path options and improve the visualization of network graphs.
The inability to select the output directory is a major problem. All output goes into the running directory, which can only be the PhaGCN2.0 directory, so multiple tasks cannot run at the same time. I recommend fixing this issue first and considering the ICTV update afterwards.
@sleepwell-zhd, you can try this way:
# Installation
git clone https://github.com/KennthShang/PhaGCN2.0.git
cd PhaGCN2.0
rm supplementary\ file/ __pycache__/ pred/ final_prediction.csv -rf
vi run_KnowledgeGraph.py # Comment out line 169, since the database does not need to be rebuilt.
conda env create -f environment.yaml -n phagcn2
# Prepare the database
cd database
tar -zxvf ALL_protein.tar.gz
diamond makedb --in ALL_protein.fasta -d database.dmnd
diamond blastp --sensitive -d database.dmnd -q ALL_protein.fasta -o database.self-diamond.tab
awk '$1!=$2 {print $1,$2,$11}' database.self-diamond.tab > database.self-diamond.tab.abc
cd ..
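As a sanity check, here is what the awk filter above produces on a tiny hand-made sample (hypothetical protein IDs; the real input is diamond's 12-column tabular output):

```shell
# Tiny hand-made sample of diamond blastp tabular output (12 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore).
cat > sample.tab <<'EOF'
protA	protA	100.0	200	0	0	1	200	1	200	1e-100	400.0
protA	protB	85.0	180	20	2	1	180	5	185	1e-50	200.0
protB	protA	85.0	180	20	2	5	185	1	180	1e-50	200.0
EOF
# Drop self-hits and keep query, subject, e-value: the .abc edge list.
awk '$1!=$2 {print $1,$2,$11}' sample.tab > sample.abc
cat sample.abc
```

This yields the two non-self edges (`protA protB 1e-50` and the reverse), i.e. the query–subject–evalue format that the downstream network construction expects.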
Run:
The program takes two parameters:
Example:
conda activate phagcn2
export MKL_SERVICE_FORCE_INTEL=1 # this needs to be set
python run_Speed_up.py --contigs contigs.fa --len 8000
Note that the program does not accept an output path: results are written to the current directory, and every rerun overwrites the previous output 😂. Because its environment paths are not absolute, it can only be run inside the PhaGCN2.0 directory, so multiple tasks cannot run at the same time; the author has not fixed this yet: #10. It is therefore best to change into the intended output directory before running (copying all the run files there first), or to move the results into an output directory after each run. Moving the results afterwards still prevents running several tasks at once, so the first approach is preferable. Let's look at the concrete steps in run_Speed_up.py:
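The "copy the run files into a per-task directory" workaround can be sketched like this (dummy file names stand in for the real PhaGCN2.0 tree; a real run would also copy the C* files and symlink database/):

```shell
# Dummy stand-in for the PhaGCN2.0 checkout.
mkdir -p PhaGCN_demo && touch PhaGCN_demo/run_Speed_up.py
# One working directory per task, so outputs cannot collide.
for task in task1 task2; do
    mkdir -p "$task"
    cp PhaGCN_demo/*.py "$task"/
done
ls task1 task2
```

Each task then runs run_Speed_up.py from its own directory, so the results of parallel runs no longer overwrite each other.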
So it is better to write your own pipeline instead of using run_Speed_up.py; that makes it easy to specify the output location and to run multiple tasks at once:
#!/bin/bash
# Print usage information
usage() {
echo "Usage: $0 -p <phaGCN_dir> -i <input_file> -o <output_dir>"
exit 1
}
# Parse command-line options
while getopts ":p:i:o:" opt; do
case "${opt}" in
p)
phaGCN_dir=${OPTARG}
;;
i)
input=${OPTARG}
;;
o)
output=${OPTARG}
;;
*)
usage
;;
esac
done
# Check that all arguments were provided
if [ -z "${phaGCN_dir}" ] || [ -z "${input}" ] || [ -z "${output}" ]; then
usage
fi
# Convert the paths to absolute paths
phaGCN_dir=$(cd "$(dirname "$phaGCN_dir")" && pwd)/$(basename "$phaGCN_dir")
input=$(cd "$(dirname "$input")" && pwd)/$(basename "$input")
output=$(cd "$(dirname "$output")" && pwd)/$(basename "$output")
# Check whether the output directory exists and is non-empty
if [ -d "$output" ] && [ "$(ls -A "$output")" ]; then
echo "Error: Output directory $output already exists and is not empty."
exit 1
fi
# Create the output directory and change into it
mkdir -p "$output"
cd "$output" || exit
# Copy the Python scripts and the C-related files
cp "${phaGCN_dir}"/*.py ./
cp -r "${phaGCN_dir}/C"* ./
# Symlink the database directory
ln -s "${phaGCN_dir}/database/" ./
# Create the input directory and copy the input file into it
mkdir input/
cp "$input" input/
# Run each Python script in turn
echo "Running CNN..."
python run_CNN.py
echo "Running KnowledgeGraph..."
mkdir network
python run_KnowledgeGraph.py
echo "Running GCN..."
python run_GCN.py
echo "All tasks completed."
# Remove the copied scripts and directories
rm -rf *.py C* database
Save the above into a file named run_phagcn (note: change ~/biosoft/PhaGCN2.0 to your own directory):
vi run_phagcn
chmod +x run_phagcn
# Link it into a directory on your PATH
ln -s ~/biosoft/PhaGCN2.0/run_phagcn ~/miniconda3/envs/phagcn2/bin/
Now you can run the program from any directory and specify the output location.
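With the symlink in place, a typical invocation might look like `run_phagcn -p ~/biosoft/PhaGCN2.0 -i contigs.fa -o results/sample1` (hypothetical paths). The relative-to-absolute path conversion the script relies on can be checked in isolation:

```shell
# The script's relative-to-absolute conversion, demonstrated on a dummy path.
# It resolves the parent with cd/pwd, then reattaches the basename, so it
# works for any path whose parent directory exists.
mkdir -p demo/sub
rel="demo/sub"
abs=$(cd "$(dirname "$rel")" && pwd)/$(basename "$rel")
echo "$abs"
```

Because the paths are made absolute before the script changes into the output directory, `-p`, `-i`, and `-o` can all be given relative to wherever you launch the script.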
I'm sorry for the late reply, and I sincerely appreciate your suggestions and feedback. Specifying the output directory and updating to the new database are both in progress, and we expect to release the updated version of PhaGCN2 within one to two weeks. In the meantime, if you need to run multiple processes in parallel, you can do so by duplicating the folder. Thank you for your patience.
This feature has been added in the latest version 2.3. If you have additional questions, please feel free to contact us.
Can't I choose the output file location? Every time I run it, it overwrites the previous result.