Folder Structure for Source and Munged under siegetank #6

jhprinz · 2014-10-05T17:25:42Z

I got a little lost in the typical FAH hierarchical structure/project organization.
So what is the default or preferred folder structure for FAH and siegetank projects?

In the siegetank API tutorial, the synced folder structure is:

target_folder/
    <stream0.id>_data/
        <part_0.frame>/
            frames.xtc
        <part_1.frame>/
        <part_2.frame>/
        ...
    <stream1.id>_data/
    <stream2.id>_data/
    ...
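
For reference, the tutorial layout above can be traversed with nothing but the standard library. This is a minimal sketch, not part of the siegetank API: the function name `list_stream_parts` is hypothetical, and it assumes that each part directory is named after an integer frame number.

```python
from pathlib import Path


def list_stream_parts(target_folder):
    """Map each stream id to its part directories, sorted by frame number.

    Assumes the layout <stream.id>_data/<part.frame>/frames.xtc, with
    part directories named by an integer frame number (an assumption).
    """
    parts_by_stream = {}
    for stream_dir in sorted(Path(target_folder).glob("*_data")):
        if not stream_dir.is_dir():
            continue
        # Strip the "_data" suffix to recover the stream id.
        stream_id = stream_dir.name[: -len("_data")]
        part_dirs = [
            p for p in stream_dir.iterdir() if p.is_dir() and p.name.isdigit()
        ]
        # Sort numerically so that part 1000 comes after part 200.
        part_dirs.sort(key=lambda p: int(p.name))
        parts_by_stream[stream_id] = part_dirs
    return parts_by_stream
```

Sorting numerically rather than lexically matters here because the part directories have to be concatenated in frame order.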

On top of this I found the idea of RUNS, which seems to make sense for FAH, but since siegetank is very flexible, maybe the meaning has changed. I assume that either

A project is in siegetank called a target
A target has several simulations called streams
Streams are sorted into RUNS

or

A project has several targets and
Each target has streams which represent simulations of the exact same .pdb / topology
Streams are sorted into RUNS that represent ??? Different stages in the setup process until the simulation goes to production?

So, I propose for the siegetank-synced (unprocessed) folder structure

<target.short_name>_<target.id>_/
    RUNS/
        RUN0_<run0.short_name>/
            STREAM0_<stream0.id>/
                <part_0.first_frame>/
                    frames.xtc
                    ...
                <part_1.first_frame>/
                <part_2.first_frame>/
                ...
            STREAM1_<stream1.id>/
            STREAM2_<stream2.id>/
    ...

where target.short_name represents the project name as in FAH projects. This way it is similar to the FAH order RUN##/CLONE##/ but contains useful extra information like the stream UUID
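
The proposed synced layout could be generated by a small path helper along these lines. This is only a sketch: the function name and all parameter names are hypothetical, the run short name is not something siegetank currently provides, and the trailing underscore after `<target.id>` in the proposal is left out on the assumption that it is a typo.

```python
from pathlib import PurePosixPath


def synced_part_path(root, target_short_name, target_id,
                     run_index, run_short_name,
                     stream_index, stream_id, first_frame):
    """Build the path to frames.xtc for one part in the proposed layout."""
    return (
        PurePosixPath(root)
        / f"{target_short_name}_{target_id}"
        / "RUNS"
        / f"RUN{run_index}_{run_short_name}"
        / f"STREAM{stream_index}_{stream_id}"
        / str(first_frame)
        / "frames.xtc"
    )
```

Keeping the naming in one function means that switching STREAM to CLONE, or dropping the ids, would be a one-line change.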

Then for the munged folder in /data/choderalab/fah/munged/

<target.id>_<target.short_name>/
    all-atoms/
        run0-stream0_<stream0.id>.h5
        run0-stream1_<stream1.id>.h5
        ...
        run1-stream0_<stream0.id>.h5
        ...
    no-solvent/
        run0-stream0_<stream0.id>.h5
        run0-stream1_<stream1.id>.h5
        ...
        run1-stream0_<stream0.id>.h5
        ...

We can also

remove all ids which is more compatible, but less human-readable.
exchange STREAM for CLONE to be more compatible
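
The munged file names could be built the same way. Again a hedged sketch with hypothetical function and parameter names; the only subsets it accepts are the two named in the proposal above.

```python
from pathlib import PurePosixPath

# The two subsets named in the proposal.
SUBSETS = ("all-atoms", "no-solvent")


def munged_path(munged_root, target_id, target_short_name,
                subset, run_index, stream_index, stream_id):
    """Build the path of one munged .h5 file in the proposed layout."""
    if subset not in SUBSETS:
        raise ValueError(f"unknown subset: {subset!r}")
    filename = f"run{run_index}-stream{stream_index}_{stream_id}.h5"
    return (
        PurePosixPath(munged_root)
        / f"{target_id}_{target_short_name}"
        / subset
        / filename
    )
```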


kyleabeauchamp · 2014-10-05T17:34:16Z

I think:

Project = Target

Each pair of (run, clone) is a single stream. I don't think there will be an automatic staging process on ST.

Some of these questions cannot be fully resolved until ST implements more features from FAH (E.g. points), as that will play a role in how things are set up and organized.

jhprinz · 2014-10-05T17:41:12Z

I agree:

Target should be a Project, and it already contains basic information such as a description. Simulations / streams are then attached to it without being organized in any way.
This means we can (for now) impose an organization without interference. The problem is that internally this might become a little messy, e.g. with all files for RUNS/CLONES in one folder.

I see that this might change if the organization of ST changes.

So, for now I would keep the RUN / STREAM ordering.

What is the actual idea of RUNS in FAH? Were these meant for several iterations, or for the test phase, etc.?

kyleabeauchamp · 2014-10-05T17:43:32Z

In FAH, RUNS correspond to different starting conformations. CLONES refer to different velocities.
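
Combined with the earlier point that each (run, clone) pair is a single stream, the mapping can be sketched as follows. The function name and the label format are hypothetical and only illustrate the bookkeeping: one run index per starting conformation, one clone index per set of initial velocities.

```python
from itertools import product


def stream_labels(n_runs, n_clones):
    """Enumerate (run, clone) pairs; each pair corresponds to one stream.

    A run index selects a starting conformation and a clone index
    selects a set of initial velocities.
    """
    return [
        f"RUN{run}/CLONE{clone}"
        for run, clone in product(range(n_runs), range(n_clones))
    ]
```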

kyleabeauchamp · 2014-10-05T17:44:15Z

Also, in FAH, one is generally supposed to ensure that the different RUNS have the same number of atoms / topology / etc. Otherwise, the points will vary between the RUNS.

jhprinz · 2014-10-05T21:33:54Z

Luckily we do not have these restrictions. All streams can be totally different which means we have to be more careful staying organized.

What are points in FAH?

kyleabeauchamp · 2014-10-05T21:35:53Z

Whenever possible, we may still want to enforce these restrictions, because consistency with FAH is important.

FAH workunits award points to donors. It is the currency for doing our computations.

jhprinz · 2014-10-05T22:13:25Z

Okay, "points". I was thinking of points like checkpoints...

Wasn't the idea of siegetank to be more flexible? That seemed quite useful, but we don't want to break compatibility. That would do more harm than good...

kyleabeauchamp · 2014-10-05T22:14:50Z

My point is that eventually siegetank is going to be plugged into FAH, so we need to adopt procedures that will be compatible with FAH operation.

jhprinz · 2014-10-05T22:20:15Z

Okay, then we should really wait until there are more features in ST. For now I will start building something that we can use and adapt later. Changes should be easy to make.

kyleabeauchamp · 2014-10-05T22:23:25Z

I agree. I was just saying that we should avoid creating excessive heterogeneity within different streams of a single target, as that's "allowed but undesirable" within the current ST API.

All I'm saying is don't use a single target to simulate both HP35 and src kinase, as that may cause issues down the road.
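
One way to guard against this is a simple consistency check over the streams of a target before syncing or munging. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it expects the caller to supply a dict of atom counts per stream, since how those counts are obtained is outside its scope.

```python
from collections import Counter


def find_inconsistent_streams(atom_counts):
    """Return the ids of streams whose atom count differs from the majority.

    atom_counts maps a stream id to its number of atoms (supplied by the
    caller; this sketch does not read any topology files).
    """
    if not atom_counts:
        return []
    # Treat the most common atom count as the expected one for the target.
    expected, _ = Counter(atom_counts.values()).most_common(1)[0]
    return sorted(s for s, n in atom_counts.items() if n != expected)
```

For example, with made-up counts `{"a": 100, "b": 100, "c": 250}` the function returns `["c"]`, flagging the stream that does not match the rest of the target.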

VijayPande · 2014-10-05T22:29:10Z

Siegetank is plugged into FAH via the latest client.

Thanks,

Vijay

jchodera · 2014-10-05T23:25:45Z

Oh! Is the latest client being rolled out already?

I think everyone in the lab is excited for how much easier it is to
programmatically set up and manage ST jobs.

VijayPande · 2014-10-06T14:53:01Z

PS: The latest client is still under testing. We can push Joe and Yutong to get it out.

Thanks,
Vijay

jhprinz mentioned this issue Oct 5, 2014

[WIP] Add Siegetank support #5

Closed
