Generated output DAG JSON structure not similar to workflowhub #17

@arabnejad

Hi,
Previously I used workflowhub to generate some real-world workflow applications; it now seems to have been merged into the wfcommons package.
I ran some simple tests, and the structure of the generated output JSON file is not clear to me.
Let me explain with a simple example. For EpigenomicsRecipe, with workflowhub we have:

        "jobs": [
            {
                "name": "fastqSplit_00000001",
                "type": "compute",
                "runtime": 878.473,
                "parents": [],
                "children": [
                    "filterContams_00000002",
                    "filterContams_00000006",
                    "filterContams_00000010"
                ],
                "files": [
                    {
                        "link": "input",
                        "name": "06252281-89da-4385-b6cd-025b55f91d56.sfq",
                        "size": 57233202
                    },
                    {
                        "link": "output",
                        "name": "314ac45e-0b2b-447d-b7b3-e44806bcd60a.sfq",
                        "size": 12060453
                    },
                    {
                        "link": "output",
                        "name": "03c46ee5-7d81-48e8-b738-6a52a3f02044.sfq",
                        "size": 10733270
                    },
                    {
                        "link": "output",
                        "name": "3f8c84bf-2c61-4a30-a891-4aeda1de6fd2.sfq",
                        "size": 12346046
                    }
                ],
                "cores": 1
            },
            ...
            ...
            {
                "name": "filterContams_00000002",
                "type": "compute",
                "runtime": 12.196,
                "parents": [
                    "fastqSplit_00000001"
                ],
                "children": [
                    "sol2sanger_00000003"
                ],
                "files": [
                    {
                        "link": "input",
                        "name": "314ac45e-0b2b-447d-b7b3-e44806bcd60a.sfq",
                        "size": 12060453
                    },
                    {
                        "link": "output",
                        "name": "2b441ab8-e098-46d1-834f-dc11513ee8ec.sfq",
                        "size": 2747304
                    }
                ],
                "cores": 1
            },

As can be seen, file 314ac45e-0b2b-447d-b7b3-e44806bcd60a.sfq is listed as an output file of task fastqSplit_00000001 and as the input file of its child task filterContams_00000002.

However, the output generated with wfcommons does not have this structure.
For example:

            {
                "name": "fastqSplit_00000021",
                "type": "compute",
                "runtime": 878.473,
                "parents": [],
                "children": [
                    "filterContams_00000022",
                    "filterContams_00000023",
                    "filterContams_00000024",
                    "filterContams_00000025",
                    "filterContams_00000026",
                    "filterContams_00000027",
                    "filterContams_00000028",
                    "filterContams_00000029",
                    "filterContams_00000030"
                ],
                "files": [
                    {
                        "link": "input",
                        "name": "a22a4e96-1955-4395-8049-aad709e7e2c0.sfq",
                        "size": 367561779
                    },
                    {
                        "link": "output",
                        "name": "5c162d9d-72b6-443b-982b-4c503cbafa0a.sfq",
                        "size": 11562952
                    }
                ],
                "cores": 1
            },
            ...
            ...
            {
                "name": "filterContams_00000022",
                "type": "compute",
                "runtime": 40.919,
                "parents": [
                    "fastqSplit_00000021"
                ],
                "children": [
                    "sol2sanger_00000004"
                ],
                "files": [
                    {
                        "link": "input",
                        "name": "f1fa831a-c037-4f0d-b468-87cecff9004d.sfq",
                        "size": 5489748
                    },
                    {
                        "link": "output",
                        "name": "4c0098e8-ea3b-435b-967d-cbe3d6a5c06e.sfq",
                        "size": 1360589
                    }
                ],
                "cores": 1
            },

As you can see, filterContams_00000022, a child of task fastqSplit_00000021, has input file f1fa831a-c037-4f0d-b468-87cecff9004d.sfq, but that file is not listed as an output file of fastqSplit_00000021.

Is this a bug, @tainagdcoleman? I think that, as in workflowhub, each file name should be listed twice: once as an output file of the parent and once as an input file of the child.
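
For reference, here is a minimal sketch of how the mismatch can be checked programmatically. It assumes only the fields visible in the excerpts above (`name`, `children`, and `files` with `link` and `name`). The helper name `find_unlinked_children` is hypothetical and is not part of workflowhub or wfcommons. The sample lists are trimmed copies of the jobs shown above, so children that do not appear in a list are skipped.

```python
def find_unlinked_children(jobs):
    """Return (parent, child) pairs where no output file of the parent
    appears among the input files of the child."""
    by_name = {job["name"]: job for job in jobs}
    unlinked = []
    for job in jobs:
        outputs = {f["name"] for f in job["files"] if f["link"] == "output"}
        for child_name in job["children"]:
            child = by_name.get(child_name)
            if child is None:
                continue  # child is not in this (trimmed) job list
            inputs = {f["name"] for f in child["files"] if f["link"] == "input"}
            if not outputs & inputs:
                unlinked.append((job["name"], child_name))
    return unlinked


# Trimmed from the first excerpt (the structure I expected).
workflowhub_jobs = [
    {
        "name": "fastqSplit_00000001",
        "children": ["filterContams_00000002"],
        "files": [
            {"link": "input", "name": "06252281-89da-4385-b6cd-025b55f91d56.sfq"},
            {"link": "output", "name": "314ac45e-0b2b-447d-b7b3-e44806bcd60a.sfq"},
        ],
    },
    {
        "name": "filterContams_00000002",
        "children": ["sol2sanger_00000003"],
        "files": [
            {"link": "input", "name": "314ac45e-0b2b-447d-b7b3-e44806bcd60a.sfq"},
            {"link": "output", "name": "2b441ab8-e098-46d1-834f-dc11513ee8ec.sfq"},
        ],
    },
]

# Trimmed from the second excerpt (generated with wfcommons).
wfcommons_jobs = [
    {
        "name": "fastqSplit_00000021",
        "children": ["filterContams_00000022"],
        "files": [
            {"link": "input", "name": "a22a4e96-1955-4395-8049-aad709e7e2c0.sfq"},
            {"link": "output", "name": "5c162d9d-72b6-443b-982b-4c503cbafa0a.sfq"},
        ],
    },
    {
        "name": "filterContams_00000022",
        "children": ["sol2sanger_00000004"],
        "files": [
            {"link": "input", "name": "f1fa831a-c037-4f0d-b468-87cecff9004d.sfq"},
            {"link": "output", "name": "4c0098e8-ea3b-435b-967d-cbe3d6a5c06e.sfq"},
        ],
    },
]

print(find_unlinked_children(workflowhub_jobs))  # []
print(find_unlinked_children(wfcommons_jobs))
# [('fastqSplit_00000021', 'filterContams_00000022')]
```

The check is deliberately loose: it only requires that at least one output of the parent is an input of the child, which is the property the first excerpt has and the second lacks.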

Labels: bug (Something isn't working)
