# Command splitter
This utility module splits a series of sos commands by region (for example, by gene) to accommodate the needs of AWS.
## Input
1. command_file: A text file containing various sos commands in the correct order. The output of the command generator for one analysis is a suitable input as long as the order of operations is correct. It is assumed that every input sos command accepts a parameter called --region-name and that the analysis can be partitioned by this parameter.
2. region_list: A table whose last column holds the names of the regions by which the analysis is split.
3. s3_path: The path of the files on the AWS server.
4. virtual_machine_path: The path of the files and working directories on the virtual machine to be created.
5. wildcard_file (WIP, please ignore)
6. output_file: The suffix of the output script.
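
To make the region_list format concrete, here is a minimal sketch that reads the last column of a tab-separated table using only the standard library. The column names and gene names are made up for illustration, and the helper `read_region_names` is hypothetical. The workflow itself uses `pandas.read_csv` for this step, which likewise treats the first row as a header.

```python
import csv
import io

# Hypothetical region_list content: tab-separated, with a header row.
# Only the last column (here called "gene_id") is used as the region name.
EXAMPLE_REGION_LIST = (
    "#chr\tstart\tend\tgene_id\n"
    "chr1\t1000\t2000\tGENE_A\n"
    "chr2\t5000\t6000\tGENE_B\n"
)


def read_region_names(handle):
    """Return the last column of a tab-separated table, skipping the header row."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    return [row[-1] for row in rows[1:]]


regions = read_region_names(io.StringIO(EXAMPLE_REGION_LIST))
print(regions)  # ['GENE_A', 'GENE_B']
```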
## Output
1. A script containing all the sos run commands in the command_file, repeated for each region in the region list. Each row corresponds to one region and has the form:
   
    `copy {s3 path} {vm path} && ({command 1} --region-name {gene_name} && {command 2} --region-name {gene_name} ) || copy {vm path} {s3 path}`

## MWE
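
The example in this section expects two input files, `test_command` and `test_command_region_list`. Their real contents are not shown here, so the sketch below writes small stand-ins with the expected shape. The notebook names, step names, and gene names are invented, and `write_example_inputs` is a hypothetical helper rather than part of this module.

```python
from pathlib import Path


def write_example_inputs(directory="."):
    """Write two small, hypothetical input files for the example command."""
    directory = Path(directory)
    command_file = directory / "test_command"
    region_file = directory / "test_command_region_list"

    # Two made-up sos commands; the second spans two lines via a backslash.
    command_file.write_text(
        "sos run pipeline_a.ipynb step_a --cwd output/\n"
        "sos run pipeline_b.ipynb step_b \\\n"
        "    --cwd output/\n"
    )

    # Tab-separated table with a header row; the last column holds region names.
    region_file.write_text(
        "#chr\tstart\tend\tgene_id\n"
        "chr1\t1000\t2000\tGENE_A\n"
        "chr2\t5000\t6000\tGENE_B\n"
    )
    return command_file, region_file
```

Calling `write_example_inputs()` in the directory that holds the notebook creates both files there.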




   

In [None]:
sos run command_spliter.ipynb split_command --command_file test_command --s3_path "working/" \
    --virtual_machine_path "analysis/" \
    --region_list  test_command_region_list

In [None]:
[global]
parameter: command_file = path
parameter: wildcard_file = path("")
parameter: s3_path = str
parameter: virtual_machine_path = str
parameter: region_list = path
parameter: cwd = path(".")
parameter: output_file = path("splited_cripts.sh")


[split_command]
input: command_file
output: f'{cwd}/{_input:bn}.splited_script'
import pandas as pd

region = pd.read_csv(region_list,sep = "\t").iloc[:,-1].tolist()

# Step 1: Read the commands from a text file
commands = []
with open(_input, 'r') as file:
    command = ""
    for line in file:
        stripped_line = line.strip()
        if stripped_line.startswith('sos run'):
            if command:
                commands.append(command)
            command = stripped_line
        elif command.endswith('\\'):
            command = command[:-1].strip() + ' ' + stripped_line
    if command:
        commands.append(command)

# Step 2: Create a DataFrame with region names appended to each command
data = {command: [f"{command} --region-name {r}" for r in region] for command in commands}
df_commands = pd.DataFrame(data, index=region)

# Step 3: Creating the concatenated list
concatenated_commands = [" && ".join(row) for _, row in df_commands.iterrows()]

# Define your paths for the copy command
path1 = s3_path
path2 = virtual_machine_path

# Step 4: Modifying each command
modified_commands = [f"copy {path1} {path2} && ( {cmd} ) || copy {path2} {path1}" for cmd in concatenated_commands]

# Step 5: Write one line per region to the step output
with open(_output, 'w') as f:
    for cmd in modified_commands:
        f.write(cmd + '\n')
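
For checking the logic outside SoS, the steps above can be sketched in plain Python. This is an illustrative version under stated assumptions, not the workflow itself: it loops over regions directly instead of building a DataFrame, the function names `parse_commands` and `build_region_lines` are hypothetical, and the example commands and gene names are invented.

```python
def parse_commands(lines):
    """Join backslash-continued lines into one string per `sos run` command."""
    commands, command = [], ""
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("sos run"):
            # A new command starts; store the previous one if there is one.
            if command:
                commands.append(command)
            command = stripped
        elif command.endswith("\\"):
            # Continuation line: drop the trailing backslash and append.
            command = command[:-1].strip() + " " + stripped
    if command:
        commands.append(command)
    return commands


def build_region_lines(commands, regions, s3_path, vm_path):
    """Return one output row per region, following the template in Output."""
    rows = []
    for region in regions:
        chained = " && ".join(f"{c} --region-name {region}" for c in commands)
        rows.append(
            f"copy {s3_path} {vm_path} && ( {chained} ) || copy {vm_path} {s3_path}"
        )
    return rows


# Hypothetical input: two commands, the first split across two lines.
example = [
    "sos run a.ipynb step_a \\",
    "    --cwd out/",
    "sos run b.ipynb step_b",
]
commands = parse_commands(example)
rows = build_region_lines(commands, ["GENE_A", "GENE_B"], "working/", "analysis/")
for row in rows:
    print(row)
```

With this input, the first printed row is `copy working/ analysis/ && ( sos run a.ipynb step_a --cwd out/ --region-name GENE_A && sos run b.ipynb step_b --region-name GENE_A ) || copy analysis/ working/`.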
