In [1]:
import pandas as pd

In Python, everything is an object.

Different types of objects come with different built-in methods and attributes.
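As a minimal sketch of this idea, the cell below uses two plain Python values (the names `station` and `reading` are made up for illustration). `type()` reports what kind of object a name refers to, and each type brings its own methods.

```python
# Hypothetical example values; any Python value would work here.
station = 'central park'   # a str object
reading = 42               # an int object

# type() reports what kind of object each name refers to
print(type(station))       # <class 'str'>
print(type(reading))       # <class 'int'>

# Each type comes with its own built-in methods
station_upper = station.upper()        # a str method
reading_bits = reading.bit_length()    # an int method: bits needed to store 42

print(station_upper)       # CENTRAL PARK
print(reading_bits)        # 6
```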

A pandas DataFrame is the most common type of object we work with. 

*Functions* are tools that do things. 

*Arguments* passed to functions specify how they behave. Arguments can be passed by *position* or by *keyword*. 

A function is *called* by writing its name followed by parentheses. Arguments are passed within these parentheses.

When a function is called, it executes a routine and may *return* a value or another object. 
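The ideas above can be sketched with a small function of our own. `describe_temperature` is a hypothetical function written only for this illustration; it is not part of pandas.

```python
def describe_temperature(value, unit='F'):
    """Return a short text label for a temperature reading."""
    return f'{value} degrees {unit}'

# Arguments passed by position: matched to parameters in order
by_position = describe_temperature(83, 'F')

# The same arguments passed by keyword: matched by name
by_keyword = describe_temperature(value=83, unit='F')

# Leaving out `unit` falls back to its default value
with_default = describe_temperature(83)

print(by_position)                                  # 83 degrees F
print(by_position == by_keyword == with_default)    # True
```

All three calls return the same string; the only difference is how the arguments are matched to the function's parameters.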

`pd.read_csv()` is a function that reads a file and returns a DataFrame. It requires a file path or URL as an argument, which specifies which file to open. This can be passed as a positional argument by placing it first within the parentheses of the function call.

In [None]:
weather = pd.read_csv('https://raw.githubusercontent.com/dlevine01/urban-data-analysis-course/refs/heads/main/Data/Source%20Data/weather_data_nyc_centralpark_2016.csv')

This function also accepts many other _keyword arguments_, each passed by name with an `=` (equals sign) in the function call. For example, the `parse_dates` argument specifies which columns should be parsed as dates, and `date_format` specifies the format those dates are written in.

In [30]:
weather = pd.read_csv(
    'https://raw.githubusercontent.com/dlevine01/urban-data-analysis-course/refs/heads/main/Data/Source%20Data/weather_data_nyc_centralpark_2016.csv',
    parse_dates=['date'],
    date_format='%d-%m-%Y'
)

Note that the `=` (equals sign) is used for two different purposes.

On the first line of the cell above, it is used for *assignment*: it stores the DataFrame returned by the function under the name we choose, `weather`.

Within the function call, equals signs are used to pass values to keyword arguments.

By convention (to make code easier to read), an equals sign used for assignment is set apart with spaces, while an equals sign used to specify a keyword argument has no spaces around it.
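Both uses of `=` can be seen together in a self-contained sketch. The three rows of data here are made up, and `io.StringIO` stands in for a file so the cell runs without a download. It assumes pandas 2.0 or later, where `date_format` is available.

```python
import io
import pandas as pd

# Made-up data standing in for the weather file
csv_text = (
    'date,maximum temperature\n'
    '01-01-2016,42\n'
    '02-01-2016,40\n'
    '03-01-2016,45\n'
)

# `=` with spaces: assigns the returned DataFrame to a name
sample_weather = pd.read_csv(
    io.StringIO(csv_text),      # positional argument: what to read
    parse_dates=['date'],       # `=` without spaces: keyword argument
    date_format='%d-%m-%Y'      # dates are written day-month-year
)

# With this format, the three rows are January 1, 2 and 3
print(sample_weather['date'].dt.day.tolist())     # [1, 2, 3]
print(sample_weather['date'].dt.month.tolist())   # [1, 1, 1]
```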

Objects have attributes. Attributes are properties of an object. They are accessed with a dot after the object reference.

DataFrames have a `.shape` attribute showing their dimensions:

In [8]:
weather.shape

(366, 7)

DataFrames have a `.columns` attribute listing their columns:

In [9]:
weather.columns

Index(['date', 'maximum temperature', 'minimum temperature',
       'average temperature', 'precipitation', 'snow fall', 'snow depth'],
      dtype='object')
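The same attributes can be tried on a small DataFrame built by hand. The name `small` and its values are made up for illustration. Note that attributes are read without parentheses, because nothing is being called.

```python
import pandas as pd

# A small, made-up DataFrame: 3 rows, 2 columns
small = pd.DataFrame({
    'day': ['Mon', 'Tue', 'Wed'],
    'high': [50, 60, 70],
})

dimensions = small.shape              # (number of rows, number of columns)
column_names = list(small.columns)    # column labels as a plain list

print(dimensions)      # (3, 2)
print(column_names)    # ['day', 'high']
```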

Objects have methods. Methods are functions which act on these objects. 

Like other functions, methods can take arguments to specify their behavior.

For example, DataFrames have a `.sample()` method which returns randomly sampled row(s):

In [31]:
weather.sample()

Unnamed: 0,date,maximum temperature,minimum temperature,average temperature,precipitation,snow fall,snow depth
217,2016-08-05,83,69,76.0,0,0,0


`.sample()` takes an argument for the number of rows to return.

This can be specified with an (unlabeled) positional argument:

In [32]:
weather.sample(5)

Unnamed: 0,date,maximum temperature,minimum temperature,average temperature,precipitation,snow fall,snow depth
80,2016-03-21,50,32,41.0,0.06,0.5,T
145,2016-05-25,88,61,74.5,0.00,0.0,0
70,2016-03-11,68,48,58.0,0.06,0.0,0
244,2016-09-01,79,69,74.0,0.5,0.0,0
297,2016-10-24,62,47,54.5,T,0.0,0


or, equivalently, with a keyword argument:

In [33]:
weather.sample(n=5)

Unnamed: 0,date,maximum temperature,minimum temperature,average temperature,precipitation,snow fall,snow depth
66,2016-03-07,60,36,48.0,0.0,0.0,0
323,2016-11-19,63,37,50.0,0.25,0.0,0
55,2016-02-25,61,37,49.0,0.02,0.0,0
350,2016-12-16,27,17,22.0,0.0,0.0,0
161,2016-06-10,77,57,67.0,0.0,0.0,0
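Because `.sample()` is random, its output changes from run to run. As a sketch, the cell below uses a made-up DataFrame and adds the `random_state` keyword argument (not used in the cells above) so that the same rows come back each time the cell is run. `.head()` is shown alongside as a method that takes its row count by position.

```python
import pandas as pd

# A small, made-up DataFrame
small = pd.DataFrame({'high': [50, 60, 70, 80, 90]})

# Positional argument: the first two rows
first_two = small.head(2)

# Keyword arguments: two random rows, repeatable because of random_state
random_two = small.sample(n=2, random_state=0)

print(first_two['high'].tolist())   # [50, 60]
print(len(random_two))              # 2
```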


DataFrame objects are composed of Series objects. Series represent individual columns (or rows) of data. They have additional attributes and methods. 

For example, the `.mean()` method returns the mean of the Series:

In [21]:
(
    weather['maximum temperature'].mean()
)

64.6256830601093
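To make the relationship between a DataFrame and its Series concrete, here is a small sketch with made-up values. Selecting one column with square brackets returns a Series, which has its own methods such as `.mean()` and `.max()`.

```python
import pandas as pd

# A small, made-up DataFrame
small = pd.DataFrame({
    'high': [50, 60, 70],
    'low': [30, 40, 50],
})

# Selecting a single column returns a Series
high = small['high']
print(type(high).__name__)   # Series

# Series methods summarize the values in that column
high_mean = high.mean()
high_max = high.max()

print(high_mean)   # 60.0
print(high_max)    # 70
```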

There are additional, specialized attributes and methods for different data types. These are accessed with *dot accessors*. For example, datetime attributes and methods are accessed with `.dt`.

The `.dt.day` attribute gives the day of the month for each value in a datetime Series:

In [35]:
weather['date'].dt.day

0       1
1       2
2       3
3       4
4       5
       ..
361    27
362    28
363    29
364    30
365    31
Name: date, Length: 366, dtype: int32

The `.dt.day_name()` method returns the name of the day of the week:

In [38]:
weather['date'].dt.day_name()

0         Friday
1       Saturday
2         Sunday
3         Monday
4        Tuesday
         ...    
361      Tuesday
362    Wednesday
363     Thursday
364       Friday
365     Saturday
Name: date, Length: 366, dtype: object
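The accessor pattern can be tried on a short Series built by hand. The dates and labels below are made up for illustration. The `.str` accessor, which is not covered above, is included as a second example: it plays the same role for text that `.dt` plays for datetimes.

```python
import pandas as pd

# A short datetime Series: the first three days of 2016
dates = pd.Series(pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']))

days = dates.dt.day            # attribute: no parentheses
names = dates.dt.day_name()    # method: called with parentheses

print(days.tolist())    # [1, 2, 3]
print(names.tolist())   # ['Friday', 'Saturday', 'Sunday']

# The .str accessor exposes string methods for a Series of text
labels = pd.Series(['rain', 'snow'])
upper_labels = labels.str.upper()

print(upper_labels.tolist())   # ['RAIN', 'SNOW']
```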