Alternative to a fluent API #16

Closed
dscape opened this issue Oct 23, 2020 · 4 comments

Comments

@dscape

dscape commented Oct 23, 2020

Hi,

I'm trying to derive many keys with different params:

table
  .params({date: 'Jan'})
  .derive({Jan: (e,$) => concat($.date, e) })
  .params({date: 'Feb'})
  .derive({Feb: (e,$) => concat($.date, e) })

However, the number of months can be 3, 6, or 12, so I need to build the derive call from an object that looks like:

params = [{date: 'Jan'}];
derive = { 
  Jan : f()...,
  Feb: f()...,
};

Without a fluent API, I can imagine doing something like:

table.
  notfluent([
    {params: params},
    {derive: derive}
  ])

Is there a way to achieve this in Arquero? Alternatively, I would have to append the array to the existing Arquero table, do a spread, and rename the columns. Is that the advisable solution? It would mean doing the calculations outside Arquero, which might defeat the point.

@jheer
Member

jheer commented Oct 23, 2020

I don’t think I fully understand what you are trying to accomplish and why you need params to do it. Can you provide a bit more detail about your task, the input, and the desired output?

@dscape
Author

dscape commented Oct 23, 2020

Hi @jheer - first, thanks for this. It's amazing. I was thinking of building a backend in Python to use NumPy, but this made prototyping some ideas so much quicker.

I made this Observable notebook for you to take a peek; perhaps it's clearer? I know how to do the JS bits and was just wondering how to achieve this in Arquero.

https://observablehq.com/@dscape/alternative-to-a-fluent-api-16

@jheer
Member

jheer commented Oct 26, 2020

I'm still not entirely sure what transformation logic you have in mind, but here is a guess.

staffAq
  .unroll({ month: d => op.sequence(0, 12) })
  .derive({ monthly: d => op.month(d.start) <= d.month ? d.monthly : 0 })
  .params({ names: ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'] })
  .derive({ month: (d, $) => $.names[d.month] })
  .groupby('name')
  .pivot('month', 'monthly', { sort: false })
  .select('name', 'Jan', 'Jul', 'Dec')
  .view()

The steps involved are:

  1. Add a new array of month indices (0 through 11) to each row, and unroll it so we get one row per month.
  2. Rewrite the monthly column based on the start date: months before the start month get 0.
  3. Bind a parameter: an array of month names to use as a lookup table.
  4. Replace the month indices with month names.
  5. Group by the name column to prepare for the next step.
  6. Pivot the table so that the per-row values for each month map to new columns, one per month.
  7. (Optionally) Add a select to match the output in your example. If you instead want to filter by some logic, you might include a filter before the pivot, as you see fit.
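
For readers who want to see what these steps compute, here is a minimal plain-JavaScript sketch of the same reshaping. It does not use Arquero and is not how Arquero implements these verbs. The sample rows and their values are made up for illustration; only the column names (name, start, monthly) come from the query above.

```javascript
// Hypothetical sample rows: the values are invented, only the field names
// (name, start, monthly) mirror the columns used in the query above.
const staff = [
  { name: 'Ann', start: new Date(2020, 0, 15), monthly: 100 }, // starts in Jan
  { name: 'Bob', start: new Date(2020, 6, 1), monthly: 200 },  // starts in Jul
];

// Step 3: lookup table from month index (0-11) to month name.
const names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Steps 1-2: one row per (person, month index); months before the
// start month get a monthly value of 0.
const long = staff.flatMap(d =>
  names.map((_, month) => ({
    name: d.name,
    month,
    monthly: d.start.getMonth() <= month ? d.monthly : 0,
  }))
);

// Steps 4-6: replace the index with the month name, group by person,
// and pivot so each month becomes its own column.
const wide = new Map();
for (const { name, month, monthly } of long) {
  if (!wide.has(name)) wide.set(name, { name });
  wide.get(name)[names[month]] = monthly;
}

// Step 7: keep only the columns of interest.
const result = [...wide.values()].map(
  ({ name, Jan, Jul, Dec }) => ({ name, Jan, Jul, Dec })
);

console.log(result);
```

Because the month list is an ordinary array, changing it to 3, 6, or 12 entries changes the number of output columns without touching the rest of the code.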

@jheer jheer closed this as completed Oct 26, 2020
@dscape
Author

dscape commented Oct 28, 2020

Thank you, this is excellent!
