## <u>Pandas</u>

#### Pandas is a Python library for analyzing and manipulating tabular data.

### <u> Key Components </u>
#### DataFrames: The primary data structure in Pandas, resembling a table or a spreadsheet. It's essentially a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns).
#### Series: A one-dimensional array-like object that can hold any data type. It's similar to a column in a DataFrame.
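
#### As a minimal sketch of the two structures described above, the block below builds a small Series and DataFrame by hand. The variable names and the handful of values are illustrative only and are not loaded from any file used in this notebook.

```python
import pandas as pd

# A Series: one-dimensional, with an index labelling each value.
ages = pd.Series([24, 20, 21], index=["a", "b", "c"], name="Age")

# A DataFrame: two-dimensional, and each column may have its own type.
players = pd.DataFrame(
    {
        "Player": ["Quincy Acy", "Jordan Adams", "Steven Adams"],
        "Age": [24, 20, 21],
    }
)

print(ages.ndim, ages.shape)        # 1 (3,)
print(players.ndim, players.shape)  # 2 (3, 2)
print(ages["b"])                    # look up a value by its index label -> 20
```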

In [23]:
import pandas as pd

In [15]:
## Loading Data: Read data from various formats like CSV, Excel, SQL databases, and JSON

#### The read_csv() function in Pandas reads a CSV (Comma-Separated Values) file and loads it into a DataFrame. CSV is a popular format for storing and exchanging tabular data. Each line in a CSV file represents a row in the table, with values separated by commas (or other delimiters). The format is simple and widely supported by software such as spreadsheets and databases.
#### Characteristics of CSV files: Plain text, Comma-Separated Values, Header row, Consistent Structure, Delimiter Variations

## Key Parameters
#### 'filepath_or_buffer': The path or URL of the CSV file, or a file-like object (e.g., an open file or a StringIO object).
#### 'sep': The delimiter used in the file. By default, this is a comma (,), but it can be changed to other delimiters like tabs (\t) or semicolons (;).
#### 'header': Row number(s) to use as the column names. The default ('infer') behaves like 0 when no column names are passed explicitly, meaning the first row is used as column headers. Set this to None if there is no header row.
#### 'index_col': Column(s) to set as the index of the DataFrame. This can be a column name or a column index.
#### 'usecols': Specifies which columns to read. This can be a list of column names or indices.
#### 'dtype': Data type for data or columns. You can specify a dictionary to enforce certain data types.
#### 'na_values': Additional strings to recognize as NA/NaN.
#### 'parse_dates': Indicates which columns should be parsed as dates. This can be True (to try parsing the index) or a list of column names or indices. Some Pandas versions also accept a dictionary that maps a new column name to the list of columns to combine into one date.
#### 'skiprows': Number of rows to skip at the beginning of the file or a list of row indices to skip.
#### 'nrows': Number of rows to read from the file.
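
#### The parameters above can be tried out without a file on disk. The sketch below feeds read_csv() a made-up, semicolon-delimited text through StringIO; the column names and values are hypothetical and only serve to show how the arguments interact.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = """id;name;city;joined;score
1;Ann;Colma;2024-01-15;9.5
2;Bob;Belmont;2024-02-01;NULL
3;Cid;Daly City;2024-03-20;7.0
4;Dee;Colma;2024-04-02;8.25
"""

sample = pd.read_csv(
    StringIO(csv_text),       # any file-like object works here
    sep=";",                  # semicolon-delimited instead of commas
    header=0,                 # first row holds the column names
    index_col="id",           # use the id column as the row index
    usecols=["id", "name", "joined", "score"],  # the city column is skipped
    dtype={"score": "float64"},
    na_values=["NULL"],       # treat the text NULL as a missing value
    parse_dates=["joined"],   # parse this column as datetimes
    nrows=3,                  # read only the first three data rows
)

print(sample)
print(sample.dtypes)
```

#### Here 'nrows=3' drops the fourth record, and the NULL in the score column is read as NaN.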

In [36]:
df = pd.read_csv("services.csv")

In [31]:
pd.read_csv(
    'services.csv',
    sep=',',
    header=0,
    index_col='id',
    usecols=['id', 'name', 'audience'],
    na_values=['NA', 'NULL']
)

id,audience,name
1,"Older adults age 55 or over, ethnic minorities...",Fair Oaks Adult Activity Center
2,Residents of San Mateo County age 55 or over,Second Career Employment Program
3,Older adults age 55 or over who can benefit fr...,Senior Peer Counseling
4,"Parents, children, families with problems of c...",Family Visitation Center
5,Low-income working families with children tran...,Economic Self-Sufficiency Program
6,Any age,Little House Recreational Activities
7,"Older adults who have memory or sensory loss, ...",Rosener House Adult Day Services
8,"Senior citizens age 60 or over, disabled indiv...",Meals on Wheels - South County
9,"Ethnic minorities, especially Spanish speaking",Fair Oaks Branch
10,,Main Library


#### The head() and tail() methods in Pandas are used to quickly inspect the contents of a DataFrame or Series. They show the beginning or end of the data without printing the entire dataset, which is especially useful for large datasets.

#### 'head(n)': Shows the first n rows of the DataFrame or Series (default is 5).
#### 'tail(n)': Shows the last n rows of the DataFrame or Series (default is 5).
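
#### A small self-contained sketch of both methods, using a hypothetical ten-row DataFrame so the selected rows are easy to verify by eye:

```python
import pandas as pd

numbers = pd.DataFrame({"n": range(10)})  # rows labelled 0..9

first_five = numbers.head()    # default n=5 -> rows 0..4
first_three = numbers.head(3)  # rows 0..2
last_two = numbers.tail(2)     # rows 8..9

print(list(first_five.index))   # [0, 1, 2, 3, 4]
print(list(first_three.index))  # [0, 1, 2]
print(list(last_two.index))     # [8, 9]
```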

In [41]:
df.head() #displays top 5 records by default 

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,
3,4,4,,,,Apply by phone.,"Parents, children, families with problems of c...",Provides supervised visitation services and a ...,,,...,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr...",,Family Visitation Center,,San Mateo County,active,No wait.,,
4,5,5,,,,Phone for information.,Low-income working families with children tran...,Provides fixed 8% short term loans to eligible...,Eligibility: Low-income family with legal cust...,,...,,"COMMUNITY SERVICES, Speakers, Automobile Loans",,Economic Self-Sufficiency Program,,San Mateo County,active,,,


In [43]:
df.head(3)

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
0,1,1,,,,Walk in or apply by phone.,"Older adults age 55 or over, ethnic minorities...",A walk-in center for older adults that provide...,"Age 55 or over for most programs, age 60 or ov...",,...,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites...",,Fair Oaks Adult Activity Center,,Colma,active,No wait.,,
1,2,2,,,,Apply by phone for an appointment.,Residents of San Mateo County age 55 or over,Provides training and job placement to eligibl...,"Age 55 or over, county resident and willing an...",,...,,"EMPLOYMENT/TRAINING SERVICES, Job Development,...",,Second Career Employment Program,,San Mateo County,active,Varies.,,
2,3,3,,,,Phone for information (403-4300 Ext. 4322).,Older adults age 55 or over who can benefit fr...,Offers supportive counseling services to San M...,Resident of San Mateo County age 55 or over,,...,,"Geriatric Counseling, Older Adults, Gay, Lesbi...",,Senior Peer Counseling,,San Mateo County,active,Varies.,,


In [45]:
df.tail()

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
18,19,19,,,,Call for screening appointment (650-347-3648).,,Provides free medical and dental care to those...,Low-income person without access to health care,,...,,"HEALTH SERVICES, Outpatient Care, Community Cl...",,San Mateo Free Medical Clinic,,"Belmont, Burlingame",active,Varies.,,
19,20,20,,,,Walk in.,,no unrequired fields for this service,,,...,,,,Service with blank fields,,,defunct,,,
20,21,21,,,,By phone during business hours.,,just a test service,,,...,,,,Service for Admin Test Location,,San Mateo County,inactive,,,
21,22,22,,"Cash, Check, Credit Card",Fotos para pasaportes,Walk in or apply by phone or mail,"Profit and nonprofit businesses, the public, m...",[NOTE THIS IS NOT A REAL SERVICE--THIS IS FOR ...,,passports@example.org,...,We offer 3-way interpretation services over th...,"Salud, Medicina",Spanish,Passport Photos,Government-issued picture identification,"Alameda County, San Mateo County",active,No wait to 2 weeks.,http://www.example.com,"105, 108, 108-05, 108-05-01, 111, 111-05"
22,23,22,,,,Walk in or apply by phone or mail,"Second service and nonprofit businesses, the p...",[NOTE THIS IS NOT A REAL ORGANIZATION--THIS IS...,,,...,,"Ruby on Rails/Postgres/Redis, testing, wic",,Example Service Name,,"San Mateo County, Alameda County",active,No wait to 2 weeks,http://www.example.com,


In [47]:
df.tail(3)

Unnamed: 0,id,location_id,program_id,accepted_payments,alternate_name,application_process,audience,description,eligibility,email,...,interpretation_services,keywords,languages,name,required_documents,service_areas,status,wait_time,website,taxonomy_ids
20,21,21,,,,By phone during business hours.,,just a test service,,,...,,,,Service for Admin Test Location,,San Mateo County,inactive,,,
21,22,22,,"Cash, Check, Credit Card",Fotos para pasaportes,Walk in or apply by phone or mail,"Profit and nonprofit businesses, the public, m...",[NOTE THIS IS NOT A REAL SERVICE--THIS IS FOR ...,,passports@example.org,...,We offer 3-way interpretation services over th...,"Salud, Medicina",Spanish,Passport Photos,Government-issued picture identification,"Alameda County, San Mateo County",active,No wait to 2 weeks.,http://www.example.com,"105, 108, 108-05, 108-05-01, 111, 111-05"
22,23,22,,,,Walk in or apply by phone or mail,"Second service and nonprofit businesses, the p...",[NOTE THIS IS NOT A REAL ORGANIZATION--THIS IS...,,,...,,"Ruby on Rails/Postgres/Redis, testing, wic",,Example Service Name,,"San Mateo County, Alameda County",active,No wait to 2 weeks,http://www.example.com,


In [49]:
type(df)

pandas.core.frame.DataFrame

### Understanding the Output
#### pandas: This indicates that the object is part of the Pandas library.
#### core: Refers to the core components of the Pandas library.
#### frame: Specifies that this is related to the DataFrame object.
#### DataFrame: The class name indicating that the object is an instance of the DataFrame class.

In [54]:
# In Pandas, the 'columns' attribute of a DataFrame provides the labels of the columns.

In [56]:
df.columns

Index(['id', 'location_id', 'program_id', 'accepted_payments',
       'alternate_name', 'application_process', 'audience', 'description',
       'eligibility', 'email', 'fees', 'funding_sources',
       'interpretation_services', 'keywords', 'languages', 'name',
       'required_documents', 'service_areas', 'status', 'wait_time', 'website',
       'taxonomy_ids'],
      dtype='object')

In [58]:
list(df.columns)

['id',
 'location_id',
 'program_id',
 'accepted_payments',
 'alternate_name',
 'application_process',
 'audience',
 'description',
 'eligibility',
 'email',
 'fees',
 'funding_sources',
 'interpretation_services',
 'keywords',
 'languages',
 'name',
 'required_documents',
 'service_areas',
 'status',
 'wait_time',
 'website',
 'taxonomy_ids']

In [60]:
df['service_areas'] # interested in a particular column

0                                     Colma
1                          San Mateo County
2                          San Mateo County
3                          San Mateo County
4                          San Mateo County
5                          San Mateo County
6       Belmont, Burlingame, East Palo Alto
7                   Belmont, East Palo Alto
8                          San Mateo County
9                          San Mateo County
10                         San Mateo County
11                                Daly City
12                         San Mateo County
13      Belmont, Burlingame, East Palo Alto
14         Alameda County, San Mateo County
15                                      NaN
16    Colma, Daly City, South San Francisco
17                           East Palo Alto
18                      Belmont, Burlingame
19                                      NaN
20                         San Mateo County
21         Alameda County, San Mateo County
22         San Mateo County, Ala

In [62]:
type(df['service_areas']) # Series

pandas.core.series.Series

In [47]:
list(df['status'])

['active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'active',
 'defunct',
 'inactive',
 'active',
 'active']

In [51]:
df[['service_areas']]

Unnamed: 0,service_areas
0,Colma
1,San Mateo County
2,San Mateo County
3,San Mateo County
4,San Mateo County
5,San Mateo County
6,"Belmont, Burlingame, East Palo Alto"
7,"Belmont, East Palo Alto"
8,San Mateo County
9,San Mateo County


In [53]:
type(df[['service_areas']]) # DataFrame

pandas.core.frame.DataFrame

#### A Series is essentially a one-dimensional array-like object that holds data of a single type, such as integers, strings, or floats. Think of it as a single column of data with an index that labels each element. It's useful for handling individual columns or single-dimensional data.

#### On the other hand, a DataFrame is a two-dimensional, tabular data structure that can be likened to a spreadsheet or a SQL table. It consists of multiple columns, each potentially of a different type, organized into rows and columns with both row and column indices. This makes it ideal for working with structured data where you need to manage multiple dimensions of information simultaneously.
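
#### The difference shows up directly in how a column is selected. The sketch below uses a tiny hand-built DataFrame (the 'services' name and its two rows are illustrative, not the full dataset) to contrast single and double brackets.

```python
import pandas as pd

services = pd.DataFrame(
    {
        "name": ["Fair Oaks Branch", "Main Library"],
        "status": ["active", "active"],
    }
)

as_series = services["status"]   # a single label returns a Series
as_frame = services[["status"]]  # a list of labels returns a DataFrame

print(type(as_series).__name__, as_series.shape)  # Series (2,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
```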

In [65]:
df[['email', 'keywords']] # multiple columns can be accessed by passing a list of column names

Unnamed: 0,email,keywords
0,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites..."
1,,"EMPLOYMENT/TRAINING SERVICES, Job Development,..."
2,,"Geriatric Counseling, Older Adults, Gay, Lesbi..."
3,,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr..."
4,,"COMMUNITY SERVICES, Speakers, Automobile Loans"
5,,"ADULT PROTECTION AND CARE SERVICES, In-Home Su..."
6,,"ADULT PROTECTION AND CARE SERVICES, Adult Day ..."
7,,"ADULT PROTECTION AND CARE SERVICES, Meal Sites..."
8,,"EDUCATION SERVICES, Library, Libraries, Public..."
9,,"EDUCATION SERVICES, Library, Libraries, Public..."


In [67]:
df[['email', 'name', 'keywords']]

Unnamed: 0,email,name,keywords
0,,Fair Oaks Adult Activity Center,"ADULT PROTECTION AND CARE SERVICES, Meal Sites..."
1,,Second Career Employment Program,"EMPLOYMENT/TRAINING SERVICES, Job Development,..."
2,,Senior Peer Counseling,"Geriatric Counseling, Older Adults, Gay, Lesbi..."
3,,Family Visitation Center,"INDIVIDUAL AND FAMILY DEVELOPMENT SERVICES, Gr..."
4,,Economic Self-Sufficiency Program,"COMMUNITY SERVICES, Speakers, Automobile Loans"
5,,Little House Recreational Activities,"ADULT PROTECTION AND CARE SERVICES, In-Home Su..."
6,,Rosener House Adult Day Services,"ADULT PROTECTION AND CARE SERVICES, Adult Day ..."
7,,Meals on Wheels - South County,"ADULT PROTECTION AND CARE SERVICES, Meal Sites..."
8,,Fair Oaks Branch,"EDUCATION SERVICES, Library, Libraries, Public..."
9,,Main Library,"EDUCATION SERVICES, Library, Libraries, Public..."


## '<u>dtypes</u>'

#### In a DataFrame: The dtypes attribute shows the data type of each column. Each column in a DataFrame can hold data of a specific type, such as integers, floats, strings, or dates. The dtypes attribute helps you understand what kind of data each column contains.

#### In a Series: The dtype attribute (dtypes also works and returns the same value) shows the data type of the Series itself. Since a Series is essentially a single column, it has only one data type.
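
#### As a rough illustration, the sketch below builds a small DataFrame by hand and inspects its types. The column names echo the services data, but the values are made up. It also shows one reason an id-like column can appear as float64: a missing value in an otherwise numeric column is stored as NaN, which is a float.

```python
import pandas as pd

mixed = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "program_id": [1.0, None, 3.0],  # the missing value becomes NaN
        "status": ["active", "defunct", "inactive"],
    }
)

print(mixed.dtypes)               # one dtype per column
print(mixed["id"].dtype)          # a Series has a single dtype: int64
print(mixed["program_id"].dtype)  # float64
```

#### Text columns such as 'status' are typically reported as object, although the exact label can depend on the Pandas version.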

In [63]:
df.dtypes

id                           int64
location_id                  int64
program_id                 float64
accepted_payments           object
alternate_name              object
application_process         object
audience                    object
description                 object
eligibility                 object
email                       object
fees                        object
funding_sources             object
interpretation_services     object
keywords                    object
languages                   object
name                        object
required_documents          object
service_areas               object
status                      object
wait_time                   object
website                     object
taxonomy_ids                object
dtype: object

In [72]:
df1 = pd.read_excel("LUSID Excel - Setting up your market data.xlsx")

In [74]:
type(df1)

pandas.core.frame.DataFrame

In [76]:
df1.dtypes

Unnamed: 0    float64
Unnamed: 1    float64
Unnamed: 2    float64
Unnamed: 3     object
Unnamed: 4     object
Unnamed: 5     object
Unnamed: 6    float64
Unnamed: 7     object
Unnamed: 8     object
Unnamed: 9     object
dtype: object

In [78]:
df1.columns

Index(['Unnamed: 0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4',
       'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9'],
      dtype='object')

In [80]:
df1[['Unnamed: 8', 'Unnamed: 9']]

Unnamed: 0,Unnamed: 8,Unnamed: 9
0,,
1,,
2,,
3,,
4,,
5,,
6,,
7,,
8,,
9,,


In [82]:
df2 = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")

In [83]:
df2.head(3)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S


In [86]:
df2.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [88]:
type(df2)

pandas.core.frame.DataFrame

In [90]:
df2[['Ticket', 'Fare', 'Cabin']]

Unnamed: 0,Ticket,Fare,Cabin
0,A/5 21171,7.2500,
1,PC 17599,71.2833,C85
2,STON/O2. 3101282,7.9250,
3,113803,53.1000,C123
4,373450,8.0500,
...,...,...,...
886,211536,13.0000,
887,112053,30.0000,B42
888,W./C. 6607,23.4500,
889,111369,30.0000,C148


In [92]:
df2.tail(10)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
881,882,0,3,"Markun, Mr. Johann",male,33.0,0,0,349257,7.8958,,S
882,883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22.0,0,0,7552,10.5167,,S
883,884,0,2,"Banfield, Mr. Frederick James",male,28.0,0,0,C.A./SOTON 34068,10.5,,S
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.05,,S
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.125,,Q
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0,C148,C
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q


## '<u>The read_html() function</u>'

#### The read_html() function in Pandas is used to read HTML tables into a DataFrame. It is particularly useful for extracting tabular data from web pages. This function parses HTML content and converts the tables found in the HTML into Pandas DataFrames, allowing you to work with web data in a structured format.

## How read_html() Works

#### HTML Content: The function looks for '<table>' elements in the HTML content and converts them into DataFrames.
#### Multiple Tables: read_html() always returns a list of DataFrames, one for each table found, even when the page contains only a single table.
#### Input Types: You can provide a URL, a file path, or HTML content as input (recent Pandas versions expect literal HTML to be wrapped in a file-like object such as StringIO).
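
#### The behavior described above can be checked offline with a sketch like the following. The HTML snippet is hypothetical, and read_html() depends on an optional parser (lxml, or BeautifulSoup with html5lib), so the call is guarded in case none is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical HTML snippet containing a single <table>.
html = """
<table>
  <thead>
    <tr><th>Player</th><th>PTS</th></tr>
  </thead>
  <tbody>
    <tr><td>Quincy Acy</td><td>398</td></tr>
    <tr><td>Jordan Adams</td><td>94</td></tr>
  </tbody>
</table>
"""

try:
    # The result is a list, even though only one table is present.
    tables = pd.read_html(StringIO(html))
except ImportError:
    tables = []  # no HTML parser is installed in this environment

print(len(tables))
for table in tables:
    print(table)
```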

In [99]:
url_data = pd.read_html("https://www.basketball-reference.com/leagues/NBA_2015_totals.html")

In [100]:
type(url_data)

list

In [103]:
len(url_data)

1

In [105]:
df3 = url_data[0]

In [107]:
type(df3)

pandas.core.frame.DataFrame

In [117]:
df3

Unnamed: 0,Rk,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,1,Quincy Acy,PF,24,NYK,68,22,1287,152,331,...,.784,79,222,301,68,27,22,60,147,398
1,2,Jordan Adams,SG,20,MEM,30,0,248,35,86,...,.609,9,19,28,16,16,7,14,24,94
2,3,Steven Adams,C,21,OKC,70,67,1771,217,399,...,.502,199,324,523,66,38,86,99,222,537
3,4,Jeff Adrien,PF,28,MIN,17,0,215,19,44,...,.579,23,54,77,15,4,9,9,30,60
4,5,Arron Afflalo,SG,29,TOT,78,72,2502,375,884,...,.843,27,220,247,129,41,7,116,167,1035
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
670,490,Thaddeus Young,PF,26,TOT,76,68,2434,451,968,...,.655,127,284,411,173,124,25,117,171,1071
671,490,Thaddeus Young,PF,26,MIN,48,48,1605,289,641,...,.682,75,170,245,135,86,17,75,115,685
672,490,Thaddeus Young,PF,26,BRK,28,20,829,162,327,...,.606,52,114,166,38,38,8,42,56,386
673,491,Cody Zeller,C,22,CHO,62,45,1487,172,373,...,.774,97,265,362,100,34,49,62,156,472


In [119]:
df3.columns

Index(['Rk', 'Player', 'Pos', 'Age', 'Tm', 'G', 'GS', 'MP', 'FG', 'FGA', 'FG%',
       '3P', '3PA', '3P%', '2P', '2PA', '2P%', 'eFG%', 'FT', 'FTA', 'FT%',
       'ORB', 'DRB', 'TRB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS'],
      dtype='object')

In [121]:
df3.head(5)

Unnamed: 0,Rk,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,1,Quincy Acy,PF,24,NYK,68,22,1287,152,331,...,0.784,79,222,301,68,27,22,60,147,398
1,2,Jordan Adams,SG,20,MEM,30,0,248,35,86,...,0.609,9,19,28,16,16,7,14,24,94
2,3,Steven Adams,C,21,OKC,70,67,1771,217,399,...,0.502,199,324,523,66,38,86,99,222,537
3,4,Jeff Adrien,PF,28,MIN,17,0,215,19,44,...,0.579,23,54,77,15,4,9,9,30,60
4,5,Arron Afflalo,SG,29,TOT,78,72,2502,375,884,...,0.843,27,220,247,129,41,7,116,167,1035


In [123]:
df3.tail(20)

Unnamed: 0,Rk,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
655,482,Shawne Williams,SF,28,TOT,63,22,1087,121,300,...,0.875,36,130,166,44,25,21,30,135,341
656,482,Shawne Williams,SF,28,MIA,44,22,924,102,240,...,0.848,27,112,139,36,21,17,23,116,292
657,482,Shawne Williams,SF,28,DET,19,0,163,19,60,...,1.0,9,18,27,8,4,4,7,19,49
658,483,Jeff Withey,C,24,NOP,37,0,259,32,64,...,0.68,23,41,64,11,4,18,12,26,98
659,484,Nate Wolters,PG,23,TOT,21,0,247,20,57,...,0.333,8,26,34,21,8,2,13,18,42
660,484,Nate Wolters,PG,23,MIL,11,0,142,12,31,...,0.25,2,14,16,10,5,0,3,11,25
661,484,Nate Wolters,PG,23,NOP,10,0,105,8,26,...,0.5,6,12,18,11,3,2,10,7,17
662,485,Brandan Wright,PF,27,TOT,75,7,1449,233,363,...,0.696,128,193,321,41,50,94,34,102,544
663,485,Brandan Wright,PF,27,DAL,27,0,505,101,135,...,0.75,51,59,110,10,17,42,13,44,238
664,485,Brandan Wright,PF,27,BOS,8,0,86,12,21,...,0.5,7,10,17,8,1,5,5,4,26


In [125]:
df3.dtypes

Rk        object
Player    object
Pos       object
Age       object
Tm        object
G         object
GS        object
MP        object
FG        object
FGA       object
FG%       object
3P        object
3PA       object
3P%       object
2P        object
2PA       object
2P%       object
eFG%      object
FT        object
FTA       object
FT%       object
ORB       object
DRB       object
TRB       object
AST       object
STL       object
BLK       object
TOV       object
PF        object
PTS       object
dtype: object

In [127]:
df3[['Pos', 'Age', 'Tm']]

Unnamed: 0,Pos,Age,Tm
0,PF,24,NYK
1,SG,20,MEM
2,C,21,OKC
3,PF,28,MIN
4,SG,29,TOT
...,...,...,...
670,PF,26,TOT
671,PF,26,MIN
672,PF,26,BRK
673,C,22,CHO


#### The to_csv() method in Pandas writes a DataFrame or Series to a CSV (Comma-Separated Values) file. The resulting file can be easily shared, stored, or used by other programs and systems that support the CSV format.

## Key Parameters
#### path_or_buf: The file path or object to write to. If not provided, the result is returned as a string.
#### sep: The delimiter to use. Default is a comma (,). You can use other delimiters like a semicolon (;) or tab (\t).
#### index: Whether to write row names (index). Default is True. Set to False to omit the index.
#### columns: Sequence of column labels to write. If not specified, all columns are written.
#### header: Whether to write column names. Default is True. Set to False to omit headers.
#### encoding: Encoding format to use for the output file. For example, utf-8 or utf-16.
#### na_rep: String representation of missing values. Default is an empty string.
#### quotechar: Character used to quote fields containing special characters. Default is ".
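#### lineterminator (named line_terminator before Pandas 1.5): The newline character or character sequence to use in the output. Defaults to os.linesep, which is '\n' on Linux and macOS and '\r\n' on Windows.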
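
#### A few of these parameters can be tried without touching the file system, because to_csv() returns the CSV text when no path is given. The DataFrame below is a hypothetical two-row example.

```python
import pandas as pd

teams = pd.DataFrame(
    {"Player": ["Quincy Acy", "Jordan Adams"], "Tm": ["NYK", None]}
)

# With no path, to_csv() returns the CSV text instead of writing a file.
csv_text = teams.to_csv(
    index=False,  # leave out the row index
    sep=";",      # use semicolons as the delimiter
    na_rep="NA",  # write missing values as NA instead of an empty string
)

lines = csv_text.splitlines()
print(lines)  # ['Player;Tm', 'Quincy Acy;NYK', 'Jordan Adams;NA']
```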

In [131]:
df3.to_csv('players_data.csv')
# df3.to_csv('players_data.csv', index = False)