[feedback wanted] api for pipe/parallel/serial mode #7

Open
maxogden opened this Issue Aug 6, 2014 · 7 comments

Member

maxogden commented Aug 6, 2014

consider this use case:

https://gist.github.com/maxogden/80de2ba6a6f52ff382e3

the nulls are currently the only way to tell gasket to run the pipeline one at a time (serially). if the nulls are removed then all of the gasket run import -- lines would be spawned at once, which technically works but causes my computer to almost die

so what would be a better api for disabling the auto pipe mode?

ideas:

1: make the main pipeline an object instead of an array and add an option to change behavior, e.g.:

{
  "gasket": {
    "main": {
      "commands": [
        "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
        "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-2.xml"
      ],
      "serial": true
    }
  }
}

instead of "serial": true it could be "parallel": false or "pipe": false

2: make "pipe": false by default. then you could just do this:

{
  "gasket": {
    "main": [
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-2.xml"
    ]
  }
}

and they would be spawned/run one at a time and not get piped to each other. to get them to pipe together you would have to use the syntax from option 1

3: have 2 top level default keys for 'parallel' and 'serial' commands

{
  "gasket": {
    "pipes": [
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-2.xml"
    ],
    "serial": [
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
      "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-2.xml" 
    ]
  }
}

e.g. in the above, doing gasket run pipes would act differently from gasket run serial (this one might be too magic). also i don't like the names serial and pipes that much
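
A rough sketch of how options 1 and 2 could share one code path, assuming a hypothetical `normalizePipeline` helper that is not part of gasket. It accepts either the plain array form or the object form with `commands` and `serial`, and the `defaultSerial` argument stands in for the choice between option 1 (arrays keep the auto pipe mode) and option 2 (arrays run one at a time).

```javascript
// Hypothetical helper, not gasket's actual API: turn either config shape
// into { commands, serial } so the runner only deals with one form.
function normalizePipeline(main, defaultSerial = false) {
  if (Array.isArray(main)) {
    // Plain array: behaviour depends on the chosen default.
    return { commands: main, serial: defaultSerial };
  }
  if (main !== null && typeof main === 'object' && Array.isArray(main.commands)) {
    // Object form from option 1: an explicit "serial" flag wins over the default.
    const serial = typeof main.serial === 'boolean' ? main.serial : defaultSerial;
    return { commands: main.commands, serial };
  }
  throw new TypeError('pipeline must be an array or an object with a "commands" array');
}
```

Under option 2 the runner would call `normalizePipeline(config.gasket.main, true)`, so existing array configs would silently change behaviour, which may be the strongest argument against it.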

thoughts?

jacquestardie commented Aug 21, 2014

Given what gasket is intended to do, I think the first option seems best.

Member

maxogden commented Dec 29, 2014

I think the goal of gasket should be to be as explicit and low level as possible... (I'm channeling my inner @mafintosh here)

Reading the three options above, I kind of don't like any of them now. I'd prefer something like this:

{
  "gasket": {
    "main": [
      {
        "command": "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
        "type": "serial"
      },
      {
        "command": "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
        "type": "serial"
      }
    ]
  }
}

e.g. where everything is explicit. that way the gasket.json becomes lower level, and we can worry about user-friendliness in areas like this

we would have to define all of the different types, e.g. https://github.com/datproject/datscript/blob/master/example-bionode.ds#L7-L11
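
A minimal sketch of how a runner might schedule such a flat list. The semantics here are assumptions, since the thread leaves the types undefined: a "serial" entry waits for everything before it and runs alone, while consecutive "parallel" entries start together. `planStages`, `runStages` and the injected `runCommand` are hypothetical names, not gasket functions.

```javascript
// Group a flat list of { command, type } entries into stages.
// Assumed semantics: each "serial" entry is its own stage; a run of
// consecutive "parallel" entries is merged into a single stage.
function planStages(entries) {
  const stages = [];
  for (const entry of entries) {
    if (entry.type !== 'serial' && entry.type !== 'parallel') {
      throw new Error('unknown type: ' + entry.type);
    }
    const last = stages[stages.length - 1];
    if (entry.type === 'parallel' && last && last.type === 'parallel') {
      last.commands.push(entry.command);
    } else {
      stages.push({ type: entry.type, commands: [entry.command] });
    }
  }
  return stages;
}

// Run stages one after another; commands inside a stage start together.
// runCommand is a caller-supplied function returning a promise per command.
async function runStages(stages, runCommand) {
  for (const stage of stages) {
    await Promise.all(stage.commands.map((command) => runCommand(command)));
  }
}
```

With the example config above, `planStages` yields two single-command stages, so the second import only starts once the first has finished.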

Contributor

melaniecebula commented Dec 29, 2014

I actually really prefer this to the previously mentioned approaches.

So would the different types correspond to run, then, pipe, fork? For example, if a user uses "then", the type would be "serial"? If a user uses "run", the type would be "parallel"?

Member

mafintosh commented Dec 30, 2014

@maxogden i like this. so all gasket commands are just simple non-nested arrays, right (no commands in commands)? and in case you need to nest them, you would split them into separate gasket pipelines?
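
The flat-array constraint could be enforced with a small check like the one below. This is only a sketch: `assertFlatPipelines` is a hypothetical name, and it assumes every entry is an object with a string `command`, as in the proposal above.

```javascript
// Reject nested pipelines: every named pipeline must be a flat array of
// { command, type } objects, never arrays inside arrays.
function assertFlatPipelines(pipelines) {
  for (const [name, entries] of Object.entries(pipelines)) {
    if (!Array.isArray(entries)) {
      throw new TypeError('pipeline "' + name + '" must be an array');
    }
    entries.forEach((entry, index) => {
      const isPlainEntry =
        entry !== null &&
        typeof entry === 'object' &&
        !Array.isArray(entry) &&
        typeof entry.command === 'string';
      if (!isPlainEntry) {
        throw new TypeError(
          'pipeline "' + name + '" entry ' + index + ' must be an object with a string "command"'
        );
      }
    });
  }
}
```

Nesting would then be expressed the way the examples in this thread already do it, by having one pipeline's command call `gasket run` on another named pipeline.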

karissa commented Dec 30, 2014

is it more common for people to run parallel or serial jobs?

melaniecebula referenced this issue in melaniecebula/gasket Dec 31, 2014

Handling gasket routines (e.g. main) as an array of objects, which each have a command and type field (types currently supported: parallel and serial)

melaniecebula referenced this issue Dec 31, 2014

Closed

Add types #16

gtramontina commented Dec 31, 2014

Would it be too weird if we inspected the commands looking for '|' or '&&' at the end to decide whether it pipes or serializes?
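
A sketch of what that inspection could look like, assuming a hypothetical `inferType` helper and the type names "pipe" and "serial". A trailing `||` is deliberately left alone, since it is not part of the suggestion and would otherwise be mistaken for a trailing `|`.

```javascript
// Infer the type from a trailing shell-style operator and strip it from
// the command. Returns type null when no marker is present.
function inferType(command) {
  const trimmed = command.trim();
  if (trimmed.endsWith('||')) {
    // Not part of the proposal: keep the command untouched.
    return { command: trimmed, type: null };
  }
  if (trimmed.endsWith('&&')) {
    return { command: trimmed.slice(0, -2).trim(), type: 'serial' };
  }
  if (trimmed.endsWith('|')) {
    return { command: trimmed.slice(0, -1).trim(), type: 'pipe' };
  }
  return { command: trimmed, type: null };
}
```

For example, `inferType('gasket run import &&')` returns `{ command: 'gasket run import', type: 'serial' }`.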

karissa commented Jan 1, 2015

hey @gtramontina! that could work fine. we aren't exactly sure who is going to use gasket yet. we are trying to build it as an easier abstraction than bash, and so leaving those characters out was the first idea. one complication I see is that | or && could mean "or"/"and" respectively, depending on the user's background
