# More on Intentions and Column Expressions

In [1]:
import pandas as pd
from dfply import *
import matplotlib.pylab as plt
%matplotlib inline

In [2]:
pd.set_option("display.max_columns", None)

In [3]:
artist_url = "https://github.com/MuseumofModernArt/collection/raw/master/Artists.csv"
artists = pd.read_csv(artist_url)
artists.head()

Unnamed: 0,ConstituentID,DisplayName,ArtistBio,Nationality,Gender,BeginDate,EndDate,Wiki QID,ULAN
0,1,Robert Arneson,"American, 1930–1992",American,Male,1930,1992,,
1,2,Doroteo Arnaiz,"Spanish, born 1936",Spanish,Male,1936,0,,
2,3,Bill Arnold,"American, born 1941",American,Male,1941,0,,
3,4,Charles Arnoldi,"American, born 1946",American,Male,1946,0,Q1063584,500027998.0
4,5,Per Arnoldi,"Danish, born 1941",Danish,Male,1941,0,,


In [4]:
artwork_url = "https://github.com/MuseumofModernArt/collection/raw/master/Artworks.csv"
artwork = pd.read_csv(artwork_url) # Big file, be patient
artwork.head()

Unnamed: 0,Title,Artist,ConstituentID,ArtistBio,Nationality,BeginDate,EndDate,Gender,Date,Medium,Dimensions,CreditLine,AccessionNumber,Classification,Department,DateAcquired,Cataloged,ObjectID,URL,ThumbnailURL,Circumference (cm),Depth (cm),Diameter (cm),Height (cm),Length (cm),Weight (kg),Width (cm),Seat Height (cm),Duration (sec.)
0,"Ferdinandsbrücke Project, Vienna, Austria (Ele...",Otto Wagner,6210,"(Austrian, 1841–1918)",(Austrian),(1841),(1918),(Male),1896,Ink and cut-and-pasted painted pages on paper,"19 1/8 x 66 1/2"" (48.6 x 168.9 cm)",Fractional and promised gift of Jo Carole and ...,885.1996,Architecture,Architecture & Design,1996-04-09,Y,2,http://www.moma.org/collection/works/2,http://www.moma.org/media/W1siZiIsIjU5NDA1Il0s...,,,,48.6,,,168.9,,
1,"City of Music, National Superior Conservatory ...",Christian de Portzamparc,7470,"(French, born 1944)",(French),(1944),(0),(Male),1987,Paint and colored pencil on print,"16 x 11 3/4"" (40.6 x 29.8 cm)",Gift of the architect in honor of Lily Auchinc...,1.1995,Architecture,Architecture & Design,1995-01-17,Y,3,http://www.moma.org/collection/works/3,http://www.moma.org/media/W1siZiIsIjk3Il0sWyJw...,,,,40.6401,,,29.8451,,
2,"Villa near Vienna Project, Outside Vienna, Aus...",Emil Hoppe,7605,"(Austrian, 1876–1957)",(Austrian),(1876),(1957),(Male),1903,"Graphite, pen, color pencil, ink, and gouache ...","13 1/2 x 12 1/2"" (34.3 x 31.8 cm)",Gift of Jo Carole and Ronald S. Lauder,1.1997,Architecture,Architecture & Design,1997-01-15,Y,4,http://www.moma.org/collection/works/4,http://www.moma.org/media/W1siZiIsIjk4Il0sWyJw...,,,,34.3,,,31.8,,
3,"The Manhattan Transcripts Project, New York, N...",Bernard Tschumi,7056,"(French and Swiss, born Switzerland 1944)",(),(1944),(0),(Male),1980,Photographic reproduction with colored synthet...,"20 x 20"" (50.8 x 50.8 cm)",Purchase and partial gift of the architect in ...,2.1995,Architecture,Architecture & Design,1995-01-17,Y,5,http://www.moma.org/collection/works/5,http://www.moma.org/media/W1siZiIsIjEyNCJdLFsi...,,,,50.8,,,50.8,,
4,"Villa, project, outside Vienna, Austria, Exter...",Emil Hoppe,7605,"(Austrian, 1876–1957)",(Austrian),(1876),(1957),(Male),1903,"Graphite, color pencil, ink, and gouache on tr...","15 1/8 x 7 1/2"" (38.4 x 19.1 cm)",Gift of Jo Carole and Ronald S. Lauder,2.1997,Architecture,Architecture & Design,1997-01-15,Y,6,http://www.moma.org/collection/works/6,http://www.moma.org/media/W1siZiIsIjEyNiJdLFsi...,,,,38.4,,,19.1,,


## `X` is an `Intention`

<img src="img/dfply_X_intention_1.png" width = 800>

Think of it as recording an expression for later evaluation.

In [5]:
expr = X.BeginDate.head()
expr

<dfply.base.Intention at 0x7f0330292920>
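
To make "recording an expression" concrete, here is a minimal pure-Python sketch of the idea. The `Deferred` class and the name `D` are hypothetical and are not dfply's actual implementation; the sketch only assumes that an `Intention` behaves roughly like a stored function that is replayed when `evaluate` is called.

```python
class Deferred:
    """Toy stand-in for an Intention: wraps a function to be applied later."""

    def __init__(self, func=lambda obj: obj):
        self._func = func

    def __getattr__(self, name):
        # Only reached for names not found normally, e.g. `BeginDate` or `head`.
        if name.startswith("_"):
            raise AttributeError(name)
        return Deferred(lambda obj: getattr(self._func(obj), name))

    def __call__(self, *args, **kwargs):
        # Record a call such as `.head()` without running it yet.
        return Deferred(lambda obj: self._func(obj)(*args, **kwargs))

    def evaluate(self, obj):
        # Replay every recorded step against a concrete object.
        return self._func(obj)


D = Deferred()                  # plays the role of X
shout = D.upper()               # nothing is computed here
print(shout.evaluate("moma"))   # MOMA
print(shout.evaluate("dfply"))  # DFPLY
```

The same recorded expression is replayed against two different inputs, which is the property the notebook relies on when one expression is evaluated on more than one data frame.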

## Use `evaluate` to apply the expression

We can apply an expression *later* using `evaluate` with a dataframe.

In [6]:
expr.evaluate(artists)

0    1930
1    1936
2    1941
3    1946
4    1941
Name: BeginDate, dtype: int64

## Intention expressions are reusable!

In [7]:
# Reusable!
expr.evaluate(artwork)

0    (1841)
1    (1944)
2    (1876)
3    (1944)
4    (1876)
Name: BeginDate, dtype: object

## <font color="red"> Exercise 2.3.1 </font>
    
**Tasks:**

1. Create a column expression, saved as `my_expr`, that checks whether the `Height (cm)` column is larger than 40. **Hint:** The space and `()` in the column name require you to use the `X['col name']` format.
2. Evaluate that column expression on the `artists` and `artwork` data frames, that is, run `my_expr.evaluate(df)` for each.
3. Use the expression object in a filter, e.g. `filter_by(my_expr)`, on the `artwork` data set.
4. Now try to perform the filter with a pipe and no expression, e.g. `filter_by(artwork['Height (cm)'] > 40)`. Why does this still work? When might we run into trouble?
5. Write a paragraph that summarizes how this all works.

In [8]:
# Code for Task 1
my_expr = X["Height (cm)"] > 40

In [9]:
# Code for Task 2
my_expr.evaluate(artists) # KeyError: 'Height (cm)' - artists does not have a "Height (cm)" column

KeyError: 'Height (cm)'

In [10]:
my_expr.evaluate(artwork)

0          True
1          True
2         False
3          True
4         False
          ...  
139932    False
139933    False
139934    False
139935    False
139936    False
Name: Height (cm), Length: 139937, dtype: bool

In [11]:
# Code for Task 3
# pandas has DataFrame.filter, but it selects by label (much like select); dfply's row filter is filter_by
(artwork
    >> filter_by(my_expr)
)

Unnamed: 0,Title,Artist,ConstituentID,ArtistBio,Nationality,BeginDate,EndDate,Gender,Date,Medium,Dimensions,CreditLine,AccessionNumber,Classification,Department,DateAcquired,Cataloged,ObjectID,URL,ThumbnailURL,Circumference (cm),Depth (cm),Diameter (cm),Height (cm),Length (cm),Weight (kg),Width (cm),Seat Height (cm),Duration (sec.)
0,"Ferdinandsbrücke Project, Vienna, Austria (Ele...",Otto Wagner,6210,"(Austrian, 1841–1918)",(Austrian),(1841),(1918),(Male),1896,Ink and cut-and-pasted painted pages on paper,"19 1/8 x 66 1/2"" (48.6 x 168.9 cm)",Fractional and promised gift of Jo Carole and ...,885.1996,Architecture,Architecture & Design,1996-04-09,Y,2,http://www.moma.org/collection/works/2,http://www.moma.org/media/W1siZiIsIjU5NDA1Il0s...,,,,48.600000,,,168.900000,,
1,"City of Music, National Superior Conservatory ...",Christian de Portzamparc,7470,"(French, born 1944)",(French),(1944),(0),(Male),1987,Paint and colored pencil on print,"16 x 11 3/4"" (40.6 x 29.8 cm)",Gift of the architect in honor of Lily Auchinc...,1.1995,Architecture,Architecture & Design,1995-01-17,Y,3,http://www.moma.org/collection/works/3,http://www.moma.org/media/W1siZiIsIjk3Il0sWyJw...,,,,40.640100,,,29.845100,,
3,"The Manhattan Transcripts Project, New York, N...",Bernard Tschumi,7056,"(French and Swiss, born Switzerland 1944)",(),(1944),(0),(Male),1980,Photographic reproduction with colored synthet...,"20 x 20"" (50.8 x 50.8 cm)",Purchase and partial gift of the architect in ...,2.1995,Architecture,Architecture & Design,1995-01-17,Y,5,http://www.moma.org/collection/works/5,http://www.moma.org/media/W1siZiIsIjEyNCJdLFsi...,,,,50.800000,,,50.800000,,
30,"Memorial to the Six Million Jewish Martyrs, pr...",Louis I. Kahn,2964,"(American, born Estonia. 1901–1974)",(American),(1901),(1974),(Male),1968,Charcoal and graphite on tracing paper,"44 1/2 x 66"" (113 x 167.6 cm)",Purchase,3.1997,Architecture,Architecture & Design,1997-01-15,Y,32,http://www.moma.org/collection/works/32,http://www.moma.org/media/W1siZiIsIjE3MyJdLFsi...,,,,113.000000,,,167.600000,,
31,"The Manhattan Transcripts Project, New York, N...",Bernard Tschumi,7056,"(French and Swiss, born Switzerland 1944)",(),(1944),(0),(Male),1980,Photographic reproduction with colored synthet...,"20 x 20"" (50.8 X 50.8 cm)",Purchase and partial gift of the architect in ...,4.1995,Architecture,Architecture & Design,1995-01-17,Y,33,http://www.moma.org/collection/works/33,http://www.moma.org/media/W1siZiIsIjIwMCJdLFsi...,,,,50.800000,,,50.800000,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
139882,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.22,Drawing,Drawings & Prints,2021-10-26,Y,435464,http://www.moma.org/collection/works/435464,,,,,76.200152,,,55.880112,,
139883,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.23,Drawing,Drawings & Prints,2021-10-26,Y,435465,http://www.moma.org/collection/works/435465,,,,,76.200152,,,55.880112,,
139884,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.24,Drawing,Drawings & Prints,2021-10-26,Y,435466,http://www.moma.org/collection/works/435466,,,,,76.200152,,,55.880112,,
139886,"Barcelona Exhibition, German Section, Transpor...","Ludwig Mies van der Rohe, Lilly Reich","7166, 8059","(American, born Germany. 1886–1969) (German, 1...",(American) (German),(1886) (1885),(1969) (1947),(Male) (Female),1929,Print,"17 1/4 × 33 1/2"" (43.8 × 85.1 cm)","Mies van der Rohe Archive, gift of the archite...",MR15.6,Mies van der Rohe Archive,Architecture & Design,,Y,436051,http://www.moma.org/collection/works/436051,,,,,43.815088,,,85.090170,,


In [12]:
artwork.filter(my_expr)

TypeError: iter() returned non-iterator of type 'Intention'
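
The error makes more sense once you recall what pandas' own `DataFrame.filter` does: it picks columns (or index labels) by name and does not test row values. The sketch here uses a small made-up frame, not the MoMA data, to contrast label-based `filter` with ordinary boolean row selection.

```python
import pandas as pd

# Tiny illustrative frame; the values are invented for the example.
df = pd.DataFrame(
    {
        "Title": ["a", "b"],
        "Height (cm)": [48.6, 34.3],
        "Width (cm)": [168.9, 31.8],
    }
)

# DataFrame.filter keeps columns whose label contains the given text.
by_label = df.filter(like="(cm)")
print(by_label.columns.tolist())  # ['Height (cm)', 'Width (cm)']

# Row selection by condition uses a boolean mask instead.
by_rows = df[df["Height (cm)"] > 40]
print(len(by_rows))  # 1
```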

In [13]:
# Code for Task 4
(artwork
    >> filter_by(artwork["Height (cm)"] > 40)
)

Unnamed: 0,Title,Artist,ConstituentID,ArtistBio,Nationality,BeginDate,EndDate,Gender,Date,Medium,Dimensions,CreditLine,AccessionNumber,Classification,Department,DateAcquired,Cataloged,ObjectID,URL,ThumbnailURL,Circumference (cm),Depth (cm),Diameter (cm),Height (cm),Length (cm),Weight (kg),Width (cm),Seat Height (cm),Duration (sec.)
0,"Ferdinandsbrücke Project, Vienna, Austria (Ele...",Otto Wagner,6210,"(Austrian, 1841–1918)",(Austrian),(1841),(1918),(Male),1896,Ink and cut-and-pasted painted pages on paper,"19 1/8 x 66 1/2"" (48.6 x 168.9 cm)",Fractional and promised gift of Jo Carole and ...,885.1996,Architecture,Architecture & Design,1996-04-09,Y,2,http://www.moma.org/collection/works/2,http://www.moma.org/media/W1siZiIsIjU5NDA1Il0s...,,,,48.600000,,,168.900000,,
1,"City of Music, National Superior Conservatory ...",Christian de Portzamparc,7470,"(French, born 1944)",(French),(1944),(0),(Male),1987,Paint and colored pencil on print,"16 x 11 3/4"" (40.6 x 29.8 cm)",Gift of the architect in honor of Lily Auchinc...,1.1995,Architecture,Architecture & Design,1995-01-17,Y,3,http://www.moma.org/collection/works/3,http://www.moma.org/media/W1siZiIsIjk3Il0sWyJw...,,,,40.640100,,,29.845100,,
3,"The Manhattan Transcripts Project, New York, N...",Bernard Tschumi,7056,"(French and Swiss, born Switzerland 1944)",(),(1944),(0),(Male),1980,Photographic reproduction with colored synthet...,"20 x 20"" (50.8 x 50.8 cm)",Purchase and partial gift of the architect in ...,2.1995,Architecture,Architecture & Design,1995-01-17,Y,5,http://www.moma.org/collection/works/5,http://www.moma.org/media/W1siZiIsIjEyNCJdLFsi...,,,,50.800000,,,50.800000,,
30,"Memorial to the Six Million Jewish Martyrs, pr...",Louis I. Kahn,2964,"(American, born Estonia. 1901–1974)",(American),(1901),(1974),(Male),1968,Charcoal and graphite on tracing paper,"44 1/2 x 66"" (113 x 167.6 cm)",Purchase,3.1997,Architecture,Architecture & Design,1997-01-15,Y,32,http://www.moma.org/collection/works/32,http://www.moma.org/media/W1siZiIsIjE3MyJdLFsi...,,,,113.000000,,,167.600000,,
31,"The Manhattan Transcripts Project, New York, N...",Bernard Tschumi,7056,"(French and Swiss, born Switzerland 1944)",(),(1944),(0),(Male),1980,Photographic reproduction with colored synthet...,"20 x 20"" (50.8 X 50.8 cm)",Purchase and partial gift of the architect in ...,4.1995,Architecture,Architecture & Design,1995-01-17,Y,33,http://www.moma.org/collection/works/33,http://www.moma.org/media/W1siZiIsIjIwMCJdLFsi...,,,,50.800000,,,50.800000,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
139882,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.22,Drawing,Drawings & Prints,2021-10-26,Y,435464,http://www.moma.org/collection/works/435464,,,,,76.200152,,,55.880112,,
139883,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.23,Drawing,Drawings & Prints,2021-10-26,Y,435465,http://www.moma.org/collection/works/435465,,,,,76.200152,,,55.880112,,
139884,"Untitled from Two Buttocks, Rubbing Together, ...",Obiora Udechukwu,134108,"(Nigerian, born 1946)",(Nigerian),(1946),(0),(Male),2006,One from a series of twenty-four drawings in i...,"30 × 22"" (76.2 × 55.9 cm)",Acquired through the generosity of the Contemp...,383.2021.24,Drawing,Drawings & Prints,2021-10-26,Y,435466,http://www.moma.org/collection/works/435466,,,,,76.200152,,,55.880112,,
139886,"Barcelona Exhibition, German Section, Transpor...","Ludwig Mies van der Rohe, Lilly Reich","7166, 8059","(American, born Germany. 1886–1969) (German, 1...",(American) (German),(1886) (1885),(1969) (1947),(Male) (Female),1929,Print,"17 1/4 × 33 1/2"" (43.8 × 85.1 cm)","Mies van der Rohe Archive, gift of the archite...",MR15.6,Mies van der Rohe Archive,Architecture & Design,,Y,436051,http://www.moma.org/collection/works/436051,,,,,43.815088,,,85.090170,,


In [14]:
(artists
    >> filter_by(artwork["Height (cm)"] > 40)) # ValueError: Item wrong length 139937 instead of 15250.

ValueError: Item wrong length 139937 instead of 15250.

> Maybe I don't understand the question, but I don't see why it wouldn't work. The same thing happens either way: when execution reaches `filter_by`, it either evaluates the saved expression against the piped-in data frame, producing a boolean Series that is then used to select the rows, or it receives a boolean Series that was already built the old-fashioned way (basically how you would do it in plain pandas).
>
> You do want to be careful that the data frames match. Writing out the full expression creates the boolean Series immediately, so if it was built from a different data frame than the one being piped in, the filter fails, as in the `ValueError` above, or silently passes if the two happen to have the same number of rows. And if you used the same data frame but filtered on a column that you weren't selecting, the output rows might differ from what you imagined, though that could just as well happen with ordinary filter operations.
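
The "silently pass" case can be reproduced in plain pandas. This is a sketch with invented frames, and it assumes the mask is applied by position (as a NumPy array), which is what the `Item wrong length` message above suggests; it is not a claim about how dfply works internally.

```python
import pandas as pd

# Invented frames standing in for two unrelated data sets.
tall = pd.DataFrame({"height": [50, 30, 45]})
other_same_len = pd.DataFrame({"name": ["a", "b", "c"]})
other_short = pd.DataFrame({"name": ["a", "b"]})

# Mask computed eagerly from `tall`, stripped of its index.
mask = (tall["height"] > 40).to_numpy()

# Same length: pandas accepts the mask even though it came from another frame.
silently_filtered = other_same_len[mask]
print(silently_filtered["name"].tolist())  # ['a', 'c']

# Different length: pandas refuses to apply the mask.
try:
    other_short[mask]
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
print(mismatch_error)
```

A deferred expression such as `X["height"] > 40` avoids both cases, because the mask is always computed from whichever frame is piped in.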

## Not just for data frames ... works for any* expression

In [15]:
double, line = 2*X, 3*X + 5

In [16]:
double.evaluate(3), line.evaluate(6)

(6, 23)
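
Arithmetic like `2*X` can be deferred because Python lets a class define what `*` and `+` mean for its instances. The sketch here is a toy: `Lazy` and `X_toy` are hypothetical names and the code is not taken from dfply. It reproduces the two expressions from the cell above.

```python
class Lazy:
    """Toy placeholder that builds up arithmetic to run later (not dfply's code)."""

    def __init__(self, func=lambda value: value):
        self.func = func

    def __mul__(self, other):
        return Lazy(lambda value: self.func(value) * other)

    # For 2 * X, int cannot multiply by Lazy, so Python falls back to __rmul__.
    def __rmul__(self, other):
        return Lazy(lambda value: other * self.func(value))

    def __add__(self, other):
        return Lazy(lambda value: self.func(value) + other)

    def evaluate(self, value):
        return self.func(value)


X_toy = Lazy()
double, line = 2 * X_toy, 3 * X_toy + 5
print(double.evaluate(3), line.evaluate(6))  # 6 23
```

Working through the second case by hand: `3 * 6` is 18, and `18 + 5` is 23, matching the notebook output.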

## We can make functions lazy too!

Decorate a function with `make_symbolic` to allow lazy evaluation of `Intention` objects.

In [17]:
from math import log
log = make_symbolic(log)

In [18]:
log(8, 2) # Works as expected with numbers

3.0

## Passing in `X` now makes a `log` expression

In [19]:
expr = log(X, 2) # Passing in X makes it lazy/symbolic
expr

<dfply.base.Intention at 0x7f032a28f1c0>

In [20]:
expr.evaluate(16) # Evaluate later

4.0
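
One way to picture what such a decorator might do is sketched here in plain Python. `Placeholder`, `make_lazy`, and `X_toy` are hypothetical names, and the logic is an assumption about the general technique (run immediately for ordinary arguments, return a deferred object when a placeholder is passed), not dfply's actual `make_symbolic` code.

```python
from functools import wraps
from math import log


class Placeholder:
    """Toy deferred value: holds a function to apply to a later input."""

    def __init__(self, func=lambda value: value):
        self.func = func

    def evaluate(self, value):
        return self.func(value)


def make_lazy(func):
    """Wrap `func` so it defers execution when given a Placeholder argument."""

    @wraps(func)
    def wrapper(*args):
        if not any(isinstance(arg, Placeholder) for arg in args):
            return func(*args)  # ordinary arguments: run right away

        def deferred(value):
            # Fill in each placeholder with the concrete value, then call func.
            resolved = [
                arg.evaluate(value) if isinstance(arg, Placeholder) else arg
                for arg in args
            ]
            return func(*resolved)

        return Placeholder(deferred)

    return wrapper


X_toy = Placeholder()
lazy_log = make_lazy(log)

eager = lazy_log(8, 2)       # computed immediately
expr = lazy_log(X_toy, 2)    # a Placeholder, nothing computed yet
later = expr.evaluate(16)    # computed now
print(eager, later)
```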