# Pandas accessors

- [str accessor docs](https://pandas.pydata.org/docs/reference/series.html#api-series-str)

Pandas provides several accessors; the two most common are `str` for strings and `dt` for datetimes. An accessor exposes operations specific to one data type and applies them to every value in a Series, for example converting every string in a column to uppercase.
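
As a minimal sketch of both accessors side by side, the snippet below uses a small hand-made DataFrame (the `people` frame and its values are illustrative assumptions, not the `emps` data loaded later). It shows `str` applying a string method and `dt` extracting a datetime component from each value.

```python
import pandas as pd

# Hypothetical toy data for illustration only
people = pd.DataFrame({
    "last_name": ["King", "De Haan"],
    "hire_date": pd.to_datetime(["1997-06-17", "2003-01-13"]),
})

# str accessor: apply a string method to every value in the column
upper_names = people["last_name"].str.upper()

# dt accessor: extract a datetime component from every value in the column
hire_years = people["hire_date"].dt.year

print(upper_names.tolist())
print(hire_years.tolist())
```

Each accessor is only available on a column of the matching type, so `dt` requires a datetime column such as one parsed with `parse_dates`.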

In [1]:
import pandas as pd

In [6]:
url = 'https://github.com/alx2202/DataAnalysis/raw/main/Day13/emps.csv'
emps = pd.read_csv(url, sep=';', encoding='utf-8', index_col='employee_id', parse_dates=['hire_date'])
emps

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America


In [7]:
emps.dtypes

first_name                 object
last_name                  object
job_title                  object
salary                      int64
hire_date          datetime64[ns]
department_name            object
address                    object
postal_code                object
city                       object
country                    object
dtype: object

## `str` accessors

In [9]:
emps.last_name.str.upper()

employee_id
100       KING
101    KOCHHAR
102    DE HAAN
103     HUNOLD
104      ERNST
        ...   
202        FAY
203     MAVRIS
204       BAER
205    HIGGINS
206      GIETZ
Name: last_name, Length: 107, dtype: object

In [10]:
emps.last_name.str.lower()

employee_id
100       king
101    kochhar
102    de haan
103     hunold
104      ernst
        ...   
202        fay
203     mavris
204       baer
205    higgins
206      gietz
Name: last_name, Length: 107, dtype: object

The `str` accessor also supports the indexing operator, so we can index and slice every string in a column just as we would a single Python string.

In [11]:
emps.last_name.str[0:3]

employee_id
100    Kin
101    Koc
102    De 
103    Hun
104    Ern
      ... 
202    Fay
203    Mav
204    Bae
205    Hig
206    Gie
Name: last_name, Length: 107, dtype: object

In [15]:
emps.last_name.str[-3:]

employee_id
100    ing
101    har
102    aan
103    old
104    nst
      ... 
202    Fay
203    ris
204    aer
205    ins
206    etz
Name: last_name, Length: 107, dtype: object
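
The slicing shown above can also be written with the `str.slice` method. The sketch below uses a small hypothetical Series (`names`, including one missing value, is an assumption for illustration) to show that both spellings give the same result and that missing values stay missing instead of raising an error.

```python
import pandas as pd

# Hypothetical sample with one missing value
names = pd.Series(["King", "De Haan", None], name="last_name")

first_three = names.str[0:3]          # slice with the indexing operator
same_result = names.str.slice(0, 3)   # equivalent method call
initial = names.str[0]                # a single index returns one character

print(first_three)
print(initial)
```

The method form can be handy when the slice bounds are stored in variables or passed as keyword arguments.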

In [16]:
emps.last_name.str.replace('K', 'X')

employee_id
100       Xing
101    Xochhar
102    De Haan
103     Hunold
104      Ernst
        ...   
202        Fay
203     Mavris
204       Baer
205    Higgins
206      Gietz
Name: last_name, Length: 107, dtype: object

In [19]:
emps.last_name.str.replace('K.*', 'K', regex=True)

employee_id
100          K
101          K
102    De Haan
103     Hunold
104      Ernst
        ...   
202        Fay
203     Mavris
204       Baer
205    Higgins
206      Gietz
Name: last_name, Length: 107, dtype: object
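
The two `replace` calls above differ in how the pattern is interpreted. As a rough sketch on a hypothetical three-name Series, the snippet below passes `regex` explicitly in both cases; this is a deliberate choice here because the default value of `regex` has changed between pandas versions.

```python
import pandas as pd

# Hypothetical sample of last names
names = pd.Series(["King", "Kochhar", "De Haan"])

# regex=False: "K.*" is treated as literal text, which no name contains
literal = names.str.replace("K.*", "K", regex=False)

# regex=True: "K.*" matches "K" and everything after it
pattern = names.str.replace("K.*", "K", regex=True)

print(literal.tolist())
print(pattern.tolist())
```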

In [20]:
emps.last_name.str.match('.*in.*')  # regular expression, matched from the start of each string

employee_id
100     True
101    False
102    False
103    False
104    False
       ...  
202    False
203    False
204    False
205     True
206    False
Name: last_name, Length: 107, dtype: bool

In [21]:
emps[emps.last_name.str.match('.*in.*')]

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
105,David,Austin,Programmer,4800,2007-06-25,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
122,Payam,Kaufling,Stock Manager,7900,2005-05-01,Shipping,2011 Interiors Blvd,99236,South San Francisco,United States of America
126,Irene,Mikkilineni,Stock Clerk,2700,2008-09-28,Shipping,2011 Interiors Blvd,99236,South San Francisco,United States of America
130,Mozhe,Atkinson,Stock Clerk,2800,2007-10-30,Shipping,2011 Interiors Blvd,99236,South San Francisco,United States of America
133,Jason,Mallin,Stock Clerk,3300,2006-06-14,Shipping,2011 Interiors Blvd,99236,South San Francisco,United States of America
151,David,Bernstein,Sales Representative,9500,2007-03-24,Sales,"Magdalen Centre, The Oxford Science Park",OX9 9ZB,Oxford,United Kingdom
156,Janette,King,Sales Representative,10000,2006-01-30,Sales,"Magdalen Centre, The Oxford Science Park",OX9 9ZB,Oxford,United Kingdom
164,Mattea,Marvins,Sales Representative,7200,2010-01-24,Sales,"Magdalen Centre, The Oxford Science Park",OX9 9ZB,Oxford,United Kingdom
177,Jack,Livingston,Sales Representative,8400,2008-04-23,Sales,"Magdalen Centre, The Oxford Science Park",OX9 9ZB,Oxford,United Kingdom
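
The pattern above needs the leading `.*` because `str.match` only matches from the beginning of each string. A possible alternative, sketched below on a hypothetical Series (the `names` values are assumptions for illustration), is `str.contains`, which searches anywhere in the string. Passing `na=False` makes missing values count as non-matches explicitly, so the result can be used directly as a boolean mask.

```python
import pandas as pd

# Hypothetical sample with one missing value
names = pd.Series(["King", "Austin", "Ernst", None])

# match anchors at the start, so a bare "in" finds nothing here
starts_with_in = names.str.match("in", na=False)

# the wildcard prefix lets match find "in" anywhere
match_anywhere = names.str.match(".*in.*", na=False)

# contains searches the whole string without wildcards
contains_in = names.str.contains("in", na=False)

# use the mask to filter, as in the cell above
print(names[contains_in].tolist())
```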
