Formula should include : and * interactions #18

HarlanH · 2012-07-15T15:29:15Z

No description provided.

doobwa · 2012-07-19T06:53:50Z

I'm curious about how to go about this. In the following example, + binds more tightly than : when the Expr is parsed, which is the wrong grouping for the model notation.

julia> f = Formula(:(y ~ x1 + x1:x2))
Formula([y],[:(+(x1,x1),x2)])

julia> f.rhs[1].args
2-element Any Array:
 +(x1,x1)
 x2

Doesn't this make it harder to use the : notation without changing Expr objects?

tshort · 2012-07-19T11:18:59Z

Maybe we'll have to change operators. :: looks like it might work. So would & and %. Here's a list of operators ordered by precedence from julia-parser.scm:

(define ops-by-prec
  '#((= := += -= *= /= //= .//= .*= ./= |\\=| |.\\=| ^= .^= %= |\|=| &= $= => <<= >>= >>>= ~ |.+=| |.-=|)
     (?)
     (|\|\||)
     (&&)
     ; note: there are some strange-looking things in here because
     ; the way the lexer works, every prefix of an operator must also
     ; be an operator.
     (<- -- -->)
     (> < >= <= == === != |.>| |.<| |.>=| |.<=| |.==| |.!=| |.=| |.!| |<:| |>:|)
     (: |..|)
     (+ - |.+| |.-| |\|| $)
     (<< >> >>>)
     (* / |./| % & |.*| |\\| |.\\|)
     (// .//)
     (^ |.^|)
     (|::|)
     (|.|)))
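The effect of that precedence table can be checked directly on quoted expressions. The following is a minimal sketch, assuming a recent Julia (1.x), where a quoted a:b is a :call expression; the 2012 parser shown earlier in the thread represented and printed these differently. The variable names colon_expr and amp_expr are illustrative only.

```julia
# Compare how the parser groups `:` and `&` relative to `+`.
colon_expr = :(x1 + x1:x2)   # `:` sits below `+` in the precedence table
amp_expr   = :(x1 + x1&x2)   # `&` sits with `*`, above `+`

# The outermost call is the operator that was applied last.
@assert colon_expr.args[1] == :(:)         # grouped as (x1 + x1):x2
@assert colon_expr.args[2] == :(x1 + x1)

@assert amp_expr.args[1] == :+             # grouped as x1 + (x1 & x2)
@assert amp_expr.args[3] == :(x1 & x2)
```

With & the interaction term survives as its own sub-expression, which is what a formula parser needs in order to pick it out of the right-hand side.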

HarlanH · 2012-07-19T11:36:42Z

(Tom, think you hit the close button by mistake! A bit of a GitHub UI quirk...)

I concur. I think we should go with & instead of :, as in y ~ 1 + x + x&y. There are also those redundant formula features I never use, like subtracting a predictor: y ~ 1 + x * y - y and whatnot. I don't really care whether we support those or not. I'd also prefer we stick with 0+ to remove the intercept term, and not support - 1, which I find harder to read.
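One way to support * on top of & would be to rewrite a * b as a + b + a&b before the terms are collected. The sketch below is only an illustration of that idea, assuming a recent Julia (1.x); expand_star is a hypothetical helper, not a function from the package, and it handles only the two-operand form of *.

```julia
# Hypothetical helper: rewrite `a * b` as `a + b + a & b` inside a formula's
# right-hand side. Symbols and literals pass through unchanged.
expand_star(x) = x

function expand_star(ex::Expr)
    # Rewrite the children first so nested products are expanded too.
    args = map(expand_star, ex.args)
    if ex.head == :call && args[1] == :* && length(args) == 3
        a, b = args[2], args[3]
        return :($a + $b + $a & $b)
    end
    return Expr(ex.head, args...)
end

expanded = expand_star(:(1 + x * y))
@assert expanded == :(1 + (x + y + x & y))
```

A chained product such as x * y * z parses as a single three-operand call and is left untouched by this sketch; a real implementation would need to decide how to expand it.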

doobwa · 2012-07-19T17:14:14Z

There is something to be said for supporting R's syntax: it's been around long enough for people to be familiar with it, and the Python people are starting to use it as well. Would this be possible if we instead parsed strings? As soon as I said that, though, it doesn't seem worth it.

On the other hand, the number of operations we're talking about is pretty minimal, so people will just need to look up Julia's way of doing it. One direction I think would be cool: extend this notation to also include namespaces of features a la Vowpal Wabbit's sparse format. For example, if you have a sparse, bag-of-words representation for a text document, all of these features could be under the words namespace. If you also have a categorical variable for day of week, then y ~ words * day would create interaction terms between all the word features and the day feature.
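To make the namespace idea concrete, here is a rough sketch of how crossing two namespaces could enumerate interaction terms. Everything in it is assumed for illustration: the namespaces table, the feature names, and the interactions function are hypothetical, and the terms are written in the a&b notation proposed earlier in this thread.

```julia
# Hypothetical namespace table: each name stands for a group of feature columns.
namespaces = Dict(
    "words" => ["the", "cat", "sat"],
    "day"   => ["day"],
)

# Cross every feature in namespace `a` with every feature in namespace `b`.
interactions(ns, a, b) = ["$(x)&$(y)" for x in ns[a] for y in ns[b]]

terms = interactions(namespaces, "words", "day")
@assert terms == ["the&day", "cat&day", "sat&day"]
```

For a real bag-of-words namespace the list would be large and sparse, so an implementation would likely generate these columns lazily rather than as strings.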

HarlanH · 2012-07-19T17:25:58Z

Yeah, I don't think a single-character change is a big deal here, and using Julia's parser seems a big enough win that I think we should stick with it.

As for namespaces (cool -- I need to actually try VW out sometime!), we'd need a way to define them separately from the formula. Would we want to include something like "colname groups" in the DataFrame? So, you'd somehow define "dims" to be a colname group for "height", "width", and "depth", then you could use "dims" instead of a list of those three column names? That could be useful for other things too. df["dims"] becomes a shorthand for df[["height", "width", "depth"]], and df["predictors"] and df["response"] seem natural things to define, too. So you could then call lm(:(response ~ predictors + covariates), df) or something. That's fairly awesome. I'm going to spin off an issue!
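The colname-group lookup described above could work roughly as follows. This is a sketch under stated assumptions, not the DataFrame API: groups, resolve, and resolve_all are hypothetical names, and the group table is kept as a plain Dict outside any data structure.

```julia
# Hypothetical column-group table, kept separate from the data itself.
groups = Dict("dims" => ["height", "width", "depth"])

# A group name expands to its member columns; any other name is
# treated as a single ordinary column.
resolve(groups, name) = get(groups, name, [name])

# Expand every name in a list, so groups and plain columns can be mixed.
resolve_all(groups, names) =
    reduce(vcat, [resolve(groups, n) for n in names]; init=String[])

@assert resolve(groups, "dims") == ["height", "width", "depth"]
@assert resolve_all(groups, ["dims", "weight"]) ==
        ["height", "width", "depth", "weight"]
```

Under this scheme an indexing call like df["dims"] would first pass the name through resolve and then select the resulting columns.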

closes #16

tshort closed this as completed Jul 19, 2012

HarlanH reopened this Jul 19, 2012

HarlanH mentioned this issue Jul 19, 2012

column groups #36

Closed

doobwa mentioned this issue Jul 23, 2012

Formulas #41

Merged

tshort closed this as completed Aug 4, 2012

nalimilan pushed a commit that referenced this issue Jul 8, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

40a3472

nalimilan pushed a commit that referenced this issue Jul 8, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

12d755e

nalimilan pushed a commit that referenced this issue Jul 8, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

9c4122b

nalimilan pushed a commit that referenced this issue Jul 8, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

3e55508

nalimilan pushed a commit that referenced this issue Jul 8, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

17215da

rofinn pushed a commit that referenced this issue Aug 17, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

30d7cfd

nalimilan pushed a commit that referenced this issue Aug 25, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

a78a82f

quinnj pushed a commit that referenced this issue Sep 2, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

a08bccc

quinnj pushed a commit that referenced this issue Sep 2, 2017

Remove duplicate functionality of base.vcat for void vectors (#18)

c5e3a5b

nalimilan pushed a commit that referenced this issue May 26, 2022

Add NEWS.md (#18)

1d40335
