# DataFrames in pandas
A set of examples that exhibit some of the core features of the [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) data type in the `pandas` module.

In [23]:
import numpy as np
import pandas as pd

## Basic concept
A DataFrame is a two-dimensional tabular data structure. It is easiest to picture as a spreadsheet, with rows and columns.

In [25]:
# create a DataFrame from a dictionary containing labeled pandas Series
df = pd.DataFrame({
    'name': pd.Series( ['Foo', 'Bar', 'Baz'] ),
    'email': pd.Series( ['fo1258@foo.edu', 'br9876@foo.edu', 'bz2292@foo.edu'] ),
    'midterm exam': pd.Series( [99, 64, 87] ),
    'final exam': pd.Series( [94, 72, 81] )
})
df

index,name,email,midterm exam,final exam
0,Foo,fo1258@foo.edu,99,94
1,Bar,br9876@foo.edu,64,72
2,Baz,bz2292@foo.edu,87,81
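
To connect the spreadsheet picture to the object itself, the following minimal sketch rebuilds the same table and inspects its dimensions and labels. It passes plain lists instead of `pd.Series` objects, which is a simplification made here and not something the notebook does.

```python
import pandas as pd

# same data as above, built from plain lists for brevity
df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'email': ['fo1258@foo.edu', 'br9876@foo.edu', 'bz2292@foo.edu'],
    'midterm exam': [99, 64, 87],
    'final exam': [94, 72, 81],
})

n_rows, n_cols = df.shape         # (number of rows, number of columns)
column_names = list(df.columns)   # column labels, in order
row_labels = list(df.index)       # default integer row labels

print(n_rows, n_cols)   # 3 4
print(column_names)     # ['name', 'email', 'midterm exam', 'final exam']
print(row_labels)       # [0, 1, 2]
```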


### Columns as Series
Each column is a named `pandas` Series.

In [10]:
df['midterm exam']

0    99
1    64
2    87
Name: midterm exam, dtype: int64

In [20]:
# prove that a column of a DataFrame is a Series
type( df['midterm exam'] )

pandas.core.series.Series
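
Because each column is a Series, it carries its own name and supports element-wise arithmetic. The sketch below illustrates this on a cut-down copy of the table; the `average` variable is introduced here purely for illustration.

```python
import pandas as pd

# cut-down copy of the example table
df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'midterm exam': [99, 64, 87],
    'final exam': [94, 72, 81],
})

midterm = df['midterm exam']
print(midterm.name)    # midterm exam
print(midterm.max())   # 99

# arithmetic between two columns is applied row by row and yields a new Series
average = (df['midterm exam'] + df['final exam']) / 2
print(average.tolist())   # [96.5, 68.0, 84.0]
```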

### Rows as Series
Each row can also be retrieved as a `pandas` Series, indexed by the column names.

In [30]:
# get a row by its index label
df.loc[1]

name                       Bar
email           br9876@foo.edu
midterm exam                64
final exam                  72
Name: 1, dtype: object

In [31]:
# prove that a row of a DataFrame is a Series
type( df.loc[1] )

pandas.core.series.Series

In [32]:
# get a row by its integer position
df.iloc[2]

name                       Baz
email           bz2292@foo.edu
midterm exam                87
final exam                  81
Name: 2, dtype: object
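
With the default index, labels and positions happen to coincide, so `loc` and `iloc` look interchangeable. The following sketch, which assumes the `name` column is promoted to the index (a step the notebook itself does not take), shows where they differ.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'email': ['fo1258@foo.edu', 'br9876@foo.edu', 'bz2292@foo.edu'],
    'midterm exam': [99, 64, 87],
    'final exam': [94, 72, 81],
})

# use the names as row labels instead of 0, 1, 2
by_name = df.set_index('name')

row_by_label = by_name.loc['Bar']    # looked up by its label
row_by_position = by_name.iloc[1]    # looked up by its position (second row)

print(row_by_label['email'])                   # br9876@foo.edu
print(row_by_label.equals(row_by_position))    # True
```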

## Filtering rows


In [33]:
# match a criterion
df[ df['name'] == 'Bar' ]

index,name,email,midterm exam,final exam
1,Bar,br9876@foo.edu,64,72


In [36]:
# match multiple criteria using & or | logic operators
df[ (df['name'] != 'Bar') & (df['midterm exam'] > 50) ]

index,name,email,midterm exam,final exam
0,Foo,fo1258@foo.edu,99,94
2,Baz,bz2292@foo.edu,87,81
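
The filters above work because a comparison on a column produces a Series of booleans, and indexing with it keeps only the rows marked `True`. This sketch makes the mask explicit and adds the `|` (or) and `~` (not) operators; the thresholds are arbitrary values chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'midterm exam': [99, 64, 87],
    'final exam': [94, 72, 81],
})

# a comparison yields one boolean per row
mask = df['midterm exam'] > 80
print(mask.tolist())   # [True, False, True]

# rows matching either condition; each condition needs its own parentheses
either = df[(df['midterm exam'] > 90) | (df['final exam'] < 75)]
print(either['name'].tolist())   # ['Foo', 'Bar']

# ~ inverts a mask
not_bar = df[~(df['name'] == 'Bar')]
print(not_bar['name'].tolist())   # ['Foo', 'Baz']
```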


## Filtering columns

Extracting a **single column** is straightforward with square bracket syntax.

In [51]:
# fetch the 'name' column - this returns a Series
df['name']

0    Foo
1    Bar
2    Baz
Name: name, dtype: object

The easiest way to extract **multiple columns** from a DataFrame is to supply a list of column names.

In [50]:
# fetch the 'name' and 'final exam' columns - this returns a DataFrame
df[ ['name', 'final exam'] ]

index,name,final exam
0,Foo,94
1,Bar,72
2,Baz,81
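
The return type depends on what goes inside the brackets, not on how many columns come back. As a small sketch, a one-element list still produces a DataFrame, while a bare column name produces a Series.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'final exam': [94, 72, 81],
})

as_series = df['name']     # bare label: Series
as_frame = df[['name']]    # list with one label: one-column DataFrame

print(type(as_series).__name__)   # Series
print(type(as_frame).__name__)    # DataFrame
print(as_frame.shape)             # (3, 1)
```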


## Filtering rows and columns

It is possible to use two sets of brackets to perform both row and column filters in one expression.

In [57]:
# find one row by its index, and fetch one column from the results - this returns a single value
df.loc[2]['final exam']

81

In [52]:
# filter rows by criteria, and fetch one column from the results - this returns a Series
df[ df['name'] != 'Baz']['midterm exam']

0    99
1    64
Name: midterm exam, dtype: int64

In [58]:
# filter rows, and fetch multiple columns from the results - this returns a DataFrame
df[ df['name'] != 'Baz'][ ['name', 'midterm exam'] ] 

index,name,midterm exam
0,Foo,99
1,Bar,64
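
The same three selections can also be written with a single `.loc[rows, columns]` call. This is offered as an alternative sketch, not as the notebook's own approach: chained brackets are fine for reading values, but the pandas documentation recommends a single `.loc` call when assigning to a selection.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Foo', 'Bar', 'Baz'],
    'email': ['fo1258@foo.edu', 'br9876@foo.edu', 'bz2292@foo.edu'],
    'midterm exam': [99, 64, 87],
    'final exam': [94, 72, 81],
})

# one row label, one column label: a single value
value = df.loc[2, 'final exam']
print(value)   # 81

# boolean row mask, one column label: a Series
series = df.loc[df['name'] != 'Baz', 'midterm exam']
print(series.tolist())   # [99, 64]

# boolean row mask, list of column labels: a DataFrame
frame = df.loc[df['name'] != 'Baz', ['name', 'midterm exam']]
print(frame.shape)   # (2, 2)
```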


## Importing data from files
Pandas can import from a variety of common data file formats, including CSV, JSON, fixed-width column text, and more.

In [71]:
# open data about NYC jobs from https://data.cityofnewyork.us/City-Government/NYC-Jobs/kpav-sd4t
df = pd.read_csv('./NYC_Jobs.csv')
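
Since `NYC_Jobs.csv` is a local download, here is a self-contained sketch of the same loaders reading from in-memory text instead of a file. The CSV content is made up for illustration; `pd.read_csv` and `pd.read_json` accept file paths in the same way, and `pd.read_fwf` handles fixed-width text.

```python
import io
import pandas as pd

# made-up CSV content standing in for a file on disk
csv_text = """name,midterm exam,final exam
Foo,99,94
Bar,64,72
Baz,87,81
"""

# read_csv accepts a path or any file-like object
scores = pd.read_csv(io.StringIO(csv_text))
print(scores.shape)   # (3, 3)

# round-trip through JSON to show the JSON reader as well
json_text = scores.to_json(orient='records')
from_json = pd.read_json(io.StringIO(json_text))
print(from_json['final exam'].tolist())   # [94, 72, 81]
```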

In [83]:
# show a few randomly-sampled rows
df.sample(3)

index,Job ID,Agency,Posting Type,# Of Positions,Business Title,Civil Service Title,Title Classification,Title Code No,Level,Job Category,...,Additional Information,To Apply,Hours/Shift,Work Location 1,Recruitment Contact,Residency Requirement,Posting Date,Post Until,Posting Updated,Process Date
1021,424227,FINANCIAL INFO SVCS AGENCY,External,1,Web Application Developer,SENIOR IT ARCHITECT,Non-Competitive-5,95711,0,"Technology, Data & Innovation",...,P293,External applicants please visit https://a127-...,"Monday - Friday, 9am to 5pm.",,,New York City Residency is not required for th...,11/27/2019,,11/27/2019,04/13/2021
812,458502,DEPT OF PARKS & RECREATION,Internal,50,City Seasonal Aide,CITY SEASONAL AIDE,Non-Competitive-5,91406,0,Building Operations & Maintenance,...,THIS JOB VACANCY NOTICE IS ONLY FOR CITY SEASO...,Please submit a cover letter and resume. Park...,,Queens,,"Residency in New York City, Nassau, Orange, Ro...",02/19/2021,,04/06/2021,04/13/2021
474,246734,ADMIN FOR CHILDREN'S SVCS,External,1,Compliance Review Unit Supervisor,ASSOCIATE STAFF ANALYST,Competitive-1,12627,0,"Finance, Accounting, & Procurement",...,Section 424-A of the New York Social Services ...,Click on the Apply Now button.,,,,New York City residency is generally required ...,08/05/2016,,08/08/2016,04/13/2021


In [88]:
# look for well-paying jobs (salary range reaching at least $200,000 per year) open to external candidates
df[ (df['Posting Type'] == 'External') & (df['Salary Frequency'] == 'Annual') & (df['Salary Range To'] >= 200000) ]

index,Job ID,Agency,Posting Type,# Of Positions,Business Title,Civil Service Title,Title Classification,Title Code No,Level,Job Category,...,Additional Information,To Apply,Hours/Shift,Work Location 1,Recruitment Contact,Residency Requirement,Posting Date,Post Until,Posting Updated,Process Date
111,441015,DEPARTMENT OF TRANSPORTATION,External,2,Deputy General Counsel,EXECUTIVE AGENCY COUNSEL,Non-Competitive-5,95005,M5,Legal Affairs,...,,All resumes are to be submitted electronically...,,55 Water St Ny Ny,,New York City residency is generally required ...,08/21/2020,,08/21/2020,04/13/2021
176,432041,ADMIN FOR CHILDREN'S SVCS,External,1,"Deputy Commissioner, Child and Family Well-Being",DEPUTY DIRECTOR OF ADMINISTRAT,Non-Competitive-5,52485,M7,Social Services,...,Section 424-A of the New York Social Services ...,Click on Apply Now button,,,,New York City residency is generally required ...,02/04/2020,,02/04/2020,04/13/2021
195,457777,OFFICE OF MANAGEMENT & BUDGET,External,1,Assistant Director Citywide Grants,BUDGET ANALYST (OMB)-MANAGERIA,Pending Classification-2,0608A,M4,"Finance, Accounting, & Procurement Legal Affai...",...,"REQUIREMENTS: Assistant Director ($141,766+):...","For City employees, please go to Employee Self...",,255 Greenwich Street,,New York City residency is generally required ...,02/02/2021,,02/02/2021,04/13/2021
263,445754,NYC HOUSING AUTHORITY,External,1,Vice President for Operation Support Services,ADMINISTRATIVE HOUSING SUPERIN,Competitive-1,10019,M5,Building Operations & Maintenance,...,"1.\tNYCHA employees applying for promotional, ...",Click the Apply Now button.,,,,NYCHA has no residency requirements.,09/23/2020,,02/16/2021,04/13/2021
332,434222,NYC EMPLOYEES RETIREMENT SYS,External,1,COMPUTER SYSTEMS MANAGER,COMPUTER SYSTEMS MANAGER,Competitive-1,10050,M7,"Technology, Data & Innovation",...,,"TO APPLY FOR CONSIDERATION, PLEASE FORWARD A C...",,,,New York City Residency is not required for th...,02/18/2020,,02/18/2020,04/13/2021
369,459608,NYC EMPLOYEES RETIREMENT SYS,External,1,"ADMINISTRATIVE RETIREMENT BENEFITS SPECIALIST,...",ADMINISTRATIVE RETIREMENT BENE,Competitive-1,82986,M5,Administration & Human Resources,...,,"TO APPLY FOR CONSIDERATION, PLEASE FORWARD A C...",,,,New York City residency is generally required ...,03/15/2021,,03/15/2021,04/13/2021
401,459041,NYC EMPLOYEES RETIREMENT SYS,External,1,ADMINISTRATIVE STAFF ANALYST,ADMINISTRATIVE STAFF ANALYST (,Competitive-1,10026,M7,"Public Safety, Inspections, & Enforcement",...,,"TO APPLY FOR CONSIDERATION, PLEASE FORWARD A C...",,,,New York City residency is generally required ...,03/03/2021,30-APR-2021,03/03/2021,04/13/2021
438,458110,OFFICE OF THE COMPTROLLER,External,1,Chief Risk Officer,DIRECTOR OF INVESTMENTS (COMP,Non-Competitive-5,95612,MY,"Finance, Accounting, & Procurement",...,The selected candidate will be subject to the ...,Please visit our website at https://comptrolle...,,,,New York City residency is generally required ...,02/08/2021,,02/08/2021,04/13/2021
644,453677,NYC EMPLOYEES RETIREMENT SYS,External,1,ADMINISTRATIVE MANAGEMENT AUDITOR,ADMINISTRATIVE MANAGEMENT AUDI,Competitive-1,10010,M5,Administration & Human Resources,...,,"TO APPLY FOR CONSIDERATION, PLEASE FORWARD A C...",,,,New York City residency is generally required ...,11/17/2020,,11/17/2020,04/13/2021
734,441706,TAXI & LIMOUSINE COMMISSION,External,1,General Counsel/Deputy Commissioner for Legal ...,EXECUTIVE AGENCY COUNSEL,Non-Competitive-5,95005,M6,Legal Affairs,...,,"Click, APPLY NOW Current city employees must a...",,"33 Beaver St, New York Ny",,New York City residency is generally required ...,07/13/2020,,07/13/2020,04/13/2021
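
The wide result above hides the salary columns behind the `...` placeholder. One way to make it easier to read, sketched here on a tiny made-up table that only borrows the real column names, is to keep a few columns and sort by salary. The rows and salary figures below are invented for illustration and are not taken from the NYC dataset.

```python
import pandas as pd

# invented rows; only the column names match the NYC jobs data
jobs = pd.DataFrame({
    'Business Title': ['Analyst', 'Counsel', 'Director', 'Aide', 'Chief Officer'],
    'Posting Type': ['External', 'External', 'Internal', 'External', 'External'],
    'Salary Frequency': ['Annual', 'Annual', 'Annual', 'Hourly', 'Annual'],
    'Salary Range To': [85000, 210000, 250000, 20, 230000],
})

# same three criteria as the filter above
mask = (
    (jobs['Posting Type'] == 'External')
    & (jobs['Salary Frequency'] == 'Annual')
    & (jobs['Salary Range To'] >= 200000)
)

# keep two columns and list the highest salary first
top = (
    jobs.loc[mask, ['Business Title', 'Salary Range To']]
    .sort_values('Salary Range To', ascending=False)
)
print(top['Business Title'].tolist())   # ['Chief Officer', 'Counsel']
```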
