In [82]:
import pandas as pd


## Reading data from files
- https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

What to check when creating a DataFrame:

- is there a column that can serve as the index of the DataFrame?
- are there columns containing dates, so that we can parse them into a proper date type?
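
A minimal sketch of both checks, using a tiny in-memory CSV as a stand-in for a real file. The three rows and the names `csv_text` and `sample` are made up for illustration; only the column names follow the dataset used in this notebook.

```python
import io

import pandas as pd

# Hypothetical sample data; the notebook itself reads the file from a URL.
csv_text = """employee_id;first_name;salary;hire_date
100;Steven;24000;1997-06-17
101;Neena;17000;1999-09-21
102;Lex;17000;2003-01-13
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    sep=';',
    index_col='employee_id',     # use this column as the index
    parse_dates=['hire_date'],   # parse this column into a datetime type
)

print(sample.index.name)         # name of the index column
print(sample.dtypes)             # hire_date should show a datetime64 dtype
```

Checking `dtypes` right after loading is a quick way to confirm that the date column was really parsed and not left as text.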



In [83]:
url = 'https://raw.githubusercontent.com/piotrgradzinski/dap_20230114/main/day_6_pgg/emps.csv'
emps = pd.read_csv(url, sep=';', encoding='utf-8', index_col='employee_id', parse_dates=['hire_date'])
emps

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America


## Data exploration

What we should explore:

- which data type does each column have, and is it the appropriate one?
- are there nulls or NaN values in any column? If so, we will have to decide how to deal with them.
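
The cells below cover the data types. For missing values, a minimal sketch could look like this. It uses a small hypothetical frame named `sample` as a stand-in for `emps`, with one missing postal code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for emps with one missing postal code.
sample = pd.DataFrame(
    {
        'city': ['Seattle', 'London', 'Munich'],
        'postal_code': ['98199', np.nan, '80925'],
        'salary': [24000, 6500, 10000],
    },
    index=pd.Index([100, 203, 204], name='employee_id'),
)

# Number of missing values in each column.
missing_per_column = sample.isna().sum()

# Rows that contain at least one missing value.
rows_with_missing = sample[sample.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```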

In [None]:
type(emps)


In [None]:
emps.dtypes

In [None]:
emps.info()

In [None]:
# The 25th percentile (25%) means that 25% of salaries are equal to or lower than 3 100.
emps.describe()

In [None]:
emps.describe(include='all')

In [None]:
emps.columns

In [None]:
emps.shape  # how many rows (.shape[0]) and columns (.shape[1])


In [None]:
len(emps), emps.size


### How we can access data in a DataFrame

In [None]:
# dictionary notation to access column in a data frame
emps['last_name']

In [None]:
# object notation to access column in a data frame
# we can use this notation if the name of the column does not contain space nor special characters
emps.salary

In [None]:
type(emps.salary)


With dictionary notation we have a few additional features. For example, we can access several columns at once.

In [None]:
emps[['first_name', 'last_name', 'salary']]  # providing a list of columns

In [None]:
emps.salary.mean()

## Accessing data using loc and iloc

We can use `loc` and `iloc` to access a portion of the data: a particular cell, several cells, or a row:

- [loc](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html) - label-based index ("business index"); we can also use column names
- [iloc](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html) - integer-based (positional) index, starting at 0

### `iloc`

In [None]:
emps.iloc[0] # first row

In [None]:
emps.iloc[0, 1] # first row, second column

With `iloc` we can use the same operations as when accessing list elements:

- negative indexes
- ranges of elements: `start:stop:step`

We can do that for both dimensions (rows and columns).


In [None]:
emps.iloc[0:5] # first 5 rows

In [None]:
emps.iloc[0:10:2] # every second row among the first 10 (positions 0, 2, 4, 6, 8)


In [None]:
emps.iloc[-1] # last row

In [None]:
emps.iloc[:10] # first 10 rows

In [None]:
emps.iloc[100:] # rows from 100 to the end


Instead of a single index or a range, we can provide a list of the indexes we want to get, for rows or for columns.

In [None]:
emps.iloc[0:5, [0, 3, -2]] # first 5 rows, first, fourth and last but one column

### `loc`

In [None]:
emps.loc[100] # row with index 100

In [None]:
emps.loc[100, 'first_name'] # first name of employee with index 100

In [None]:
emps.loc[100:105, ['first_name', 'last_name']]


We can use ranges with `loc` as well, but they are closed on both sides: the stop label is included in the result.
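
A minimal sketch of the difference between the two kinds of ranges. It uses a small hypothetical frame named `sample` in place of `emps`.

```python
import pandas as pd

# Hypothetical stand-in for emps, indexed by employee_id.
sample = pd.DataFrame(
    {'first_name': ['Steven', 'Neena', 'Lex', 'Alexander']},
    index=pd.Index([100, 101, 102, 103], name='employee_id'),
)

by_position = sample.iloc[0:2]   # positions 0 and 1; the stop position 2 is excluded
by_label = sample.loc[100:102]   # labels 100, 101 and 102; the stop label is included

print(len(by_position), len(by_label))
```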


In [None]:
emps.loc[100:110:2, ['first_name', 'last_name']]


Ranges can work on a column level as well.

In [None]:
emps.loc[100:110:2, 'first_name':'address'] # first name to address columns for every second row from 100 to 110
emps.loc[100:110:2, 'first_name':'address':2] # first name to address columns for every second row from 100 to 110, every second column

## How we can iterate through a DataFrame using a for loop

By default, a `for` loop over a DataFrame iterates through its column names. To iterate by rows we can use the [.iterrows()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iterrows.html) method.

This approach is a good idea when we want to present the data in a custom way, different from simply displaying the DataFrame (or a selection of it). We shouldn't use it for calculations, because a Python-level loop is much slower than the built-in Pandas (or NumPy) methods.
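
To illustrate the point about calculations, here is a minimal sketch that computes the same total twice: once with a row loop and once with a built-in column operation. The frame `sample` and the variable names are hypothetical stand-ins for `emps`.

```python
import pandas as pd

# Hypothetical stand-in for emps.
sample = pd.DataFrame(
    {'first_name': ['Steven', 'Neena', 'Lex'], 'salary': [24000, 17000, 17000]},
    index=pd.Index([100, 101, 102], name='employee_id'),
)

# Row-by-row loop: fine for custom printing, slow for arithmetic on large frames.
total_loop = 0
for emp_id, emp in sample.iterrows():
    total_loop += emp['salary'] * 12

# Built-in equivalent: one operation over the whole column.
total_vectorized = (sample['salary'] * 12).sum()

print(total_loop, total_vectorized)
```

Both approaches give the same number; on a frame with many rows the second one is usually much faster.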



In [None]:
for column_name in emps: # iterating over columns
    print(column_name)

In [None]:
for emp_id, emp in emps.iterrows(): # iterating over rows (index, row)  # iterrows() is a generator function 
    print(emp_id, emp['first_name'], emp['last_name']) # we can access columns using dictionary notation

In [None]:
for emp_id, emp in emps.iterrows(): # iterating over rows (index, row)  # iterrows() is a generator function
    print(emp_id, emp.first_name, emp.last_name) # we can access columns using object notation

## Filtering and logical conditions

On a Series we can use comparison operators. They return a mask: a Series of True/False values saying whether each value fulfills the condition or not.

Once we have a mask, we can use it with the indexing operator to filter rows of the DataFrame.



In [None]:
emps.salary > 10_000 # returns a series of booleans


In [None]:
emps[emps.salary > 10_000] # returns a data frame with rows where salary is greater than 10 000


In [None]:
emps[emps['city'] == 'Oxford'] # returns a data frame with rows where city is equal to Oxford


In [None]:
emps[emps['city'] == 'Oxford'].salary.mean()
# emps[emps['city'] == 'London'].salary.mean()


Alternatively, we can first take the `salary` column and then filter it by city. This is possible because all Series of a DataFrame share the same index (`employee_id`).


In [None]:
emps.salary[emps.city == 'Oxford'].mean()

If we want to combine several conditions, we can't use Python's `and` and `or` operators; they do not work element by element on Pandas objects. We have to use the so-called bitwise operators `&` (for `and`) and `|` (for `or`). Because `&` and `|` take precedence over the standard comparison operators, each condition must be wrapped in `()` for the statement to work.
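
A minimal sketch of what happens with `and` compared with `&`. The frame `sample` and the flag `and_failed` are hypothetical names used only for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for emps.
sample = pd.DataFrame(
    {'city': ['Oxford', 'Oxford', 'Seattle'], 'salary': [12000, 8000, 24000]},
    index=pd.Index([145, 150, 100], name='employee_id'),
)

# Python's `and` asks a whole Series for a single True/False value, which is ambiguous.
try:
    sample[(sample.city == 'Oxford') and (sample.salary >= 10_000)]
    and_failed = False
except ValueError:
    and_failed = True

# `&` combines the two masks element by element; the parentheses are required.
oxford_high = sample[(sample.city == 'Oxford') & (sample.salary >= 10_000)]

print(and_failed)
print(oxford_high)
```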



In [None]:
emps[(emps.city == 'Oxford') & (emps.salary >= 10_000)] # returns a data frame with rows where city is equal to Oxford and salary is greater than or equal to 10 000

## Exercises

List employees from Seattle with columns: first_name, last_name, salary and city.



In [None]:
emps[emps.city == 'Seattle'][['first_name', 'last_name', 'salary', 'city']]
# We select columns by passing their names. Passing emps.first_name instead would pass
# the whole column (a Series object), not a column name.


In [None]:
emps[emps.city == 'Seattle'].loc[:, ['first_name', 'last_name', 'salary', 'city']]


One interesting feature of `loc` is that we can pass a mask (condition) for one of the dimensions.

emps.loc[emps.city == 'Seattle', ['first_name', 'last_name', 'salary', 'city']]


In [None]:
emps.loc[(emps.city == 'Seattle') | (emps.city == 'Oxford'), ['first_name', 'last_name', 'salary', 'city']]

Using `iloc`, get the first 10 employees from Oxford.



In [None]:
emps[emps.city == 'Oxford'].iloc[: 10] # first 10 rows of employees from Oxford

[`.head(X)`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html) returns the first X rows of the DataFrame.

In [None]:
emps[emps.city == 'Oxford'].head(10)


[`.tail(X)`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.tail.html#) returns the last X rows of the DataFrame.

In [None]:
emps[emps.city == 'Oxford'].tail(10) # last 10 rows of employees from Oxford

Calculate the mean salary of the first 10 employees from Oxford.

- filter the data
- get 10 employees
- get the salary column
- compute the mean of this column

In [None]:
emps[emps.city == 'Oxford'].iloc[:10].salary.mean() # mean salary of first 10 employees from Oxford

## Data modification
A DataFrame object is mutable, which means we can change the content of the DataFrame.

In this example we have two variables, `emps` and `my_df`, that point to the same object: the same DataFrame in memory. Changing data through one of them changes it for the other as well. It's not a copy; we are working on the same DataFrame.



In [None]:
my_df = emps  # WE ARE NOT COPYING THE DATAFRAME!
emps


In [None]:
my_df

In [None]:
my_df.iloc[0,0] = 'John' # changing the value in the first row, first column
my_df

In [None]:
emps

## How can we copy a whole DataFrame or Series?
`.copy()` copies a DataFrame (or Series) so that we can assign the copy to another variable. The result is a new object in memory.

Changes made to the copy do not affect the original object.
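
A minimal sketch that contrasts a second variable name with a real copy. The names `original`, `alias` and `duplicate` are hypothetical and used only here.

```python
import pandas as pd

# Hypothetical stand-in for emps.
original = pd.DataFrame(
    {'last_name': ['King', 'Kochhar']},
    index=pd.Index([100, 101], name='employee_id'),
)

alias = original             # second name for the same object
duplicate = original.copy()  # new, independent object

duplicate.loc[100, 'last_name'] = 'DOE'  # changes only the copy

print(alias is original)      # True: both names refer to one object
print(duplicate is original)  # False: the copy is a different object
print(original.loc[100, 'last_name'])
```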


In [84]:
emps_copy = emps.copy() # WE ARE COPYING THE DATAFRAME!

In [85]:
emps_copy.iloc[0, 1] = 'DOE'
emps_copy


employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,DOE,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America


In [86]:
emps

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America


In [88]:
emps_copy.salary += 123 # adding 123 to all salaries (the output below shows +246, apparently because this cell was run twice)
emps_copy

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,DOE,President,24246,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17246,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17246,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9246,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6246,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6246,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6746,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10246,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12246,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America


In [89]:
emps_seattle = emps.loc[emps.city == 'Seattle', ['first_name', 'last_name', 'salary', 'city']]
emps_seattle

employee_id,first_name,last_name,salary,city
100,Steven,King,24000,Seattle
101,Neena,Kochhar,17000,Seattle
102,Lex,De Haan,17000,Seattle
108,Nancy,Greenberg,12000,Seattle
109,Daniel,Faviet,9000,Seattle
110,John,Chen,8200,Seattle
111,Ismael,Sciarra,7700,Seattle
112,Jose Manuel,Urman,7800,Seattle
113,Luis,Popp,6900,Seattle
114,Den,Raphaely,11000,Seattle


In [90]:
emps_seattle.salary += 11
emps_seattle


employee_id,first_name,last_name,salary,city
100,Steven,King,24011,Seattle
101,Neena,Kochhar,17011,Seattle
102,Lex,De Haan,17011,Seattle
108,Nancy,Greenberg,12011,Seattle
109,Daniel,Faviet,9011,Seattle
110,John,Chen,8211,Seattle
111,Ismael,Sciarra,7711,Seattle
112,Jose Manuel,Urman,7811,Seattle
113,Luis,Popp,6911,Seattle
114,Den,Raphaely,11011,Seattle


In [96]:
emps # emps is not changed
# Why? emps_seattle was created by selecting rows and columns from emps, which gave us a new DataFrame,
# so `emps_seattle.salary += 11` modified that new object and not emps.
# To change the original DataFrame, assign through .loc[] or .iloc[] directly on emps.
# For a single cell we can also use .at[] or .iat[], which are faster than .loc[] and .iloc[] for scalar access.
# Sample code: emps.loc[emps.city == 'Seattle', 'salary'] += 11 changes the original DataFrame.

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America
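
The comments in the cell above mention `.loc[]` and `.at[]` as ways to change the original DataFrame. A minimal sketch of both, on a small hypothetical frame named `sample` so that `emps` itself stays untouched:

```python
import pandas as pd

# Hypothetical stand-in for emps.
sample = pd.DataFrame(
    {'city': ['Seattle', 'Southlake', 'Seattle'], 'salary': [24000, 9000, 17000]},
    index=pd.Index([100, 103, 101], name='employee_id'),
)

# Select rows and column in a single .loc call so the assignment targets `sample` itself.
sample.loc[sample.city == 'Seattle', 'salary'] += 11

# .at addresses exactly one cell by row label and column name.
sample.at[103, 'salary'] = 9500

print(sample)
```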


In [98]:
emps['yearly_salary'] = 12 * emps['salary'] # adding a new column
emps

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country,yearly_salary
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America,288000
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America,204000
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America,204000
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America,108000
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America,72000
...,...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada,72000
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom,78000
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany,120000
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America,144000


To remove columns or rows from the DataFrame we can use [`.drop()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop.html).

Many Pandas methods that change a DataFrame do not modify the original object; instead they return a copy of the original DataFrame with the modifications applied.

This is the default behaviour, and for some methods we can change it. We need to check the documentation of the particular method: if it has an `inplace` parameter, we can pass `inplace=True` to indicate that we want to modify the original DataFrame.
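
A minimal sketch of both behaviours of `.drop()`, including dropping a row by its index label. The frame `sample` and the result names are hypothetical and used only for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for emps.
sample = pd.DataFrame(
    {'salary': [24000, 17000], 'country': ['USA', 'USA']},
    index=pd.Index([100, 101], name='employee_id'),
)

without_country = sample.drop(columns=['country'])  # returns a new DataFrame
without_first_row = sample.drop(index=[100])        # rows are dropped by index label

print(list(sample.columns))           # the original still has both columns
print(list(without_country.columns))
print(list(without_first_row.index))

# With inplace=True the method modifies `sample` itself and returns None.
sample.drop(columns=['country'], inplace=True)
print(list(sample.columns))
```

Assigning the returned DataFrame to a variable is an alternative to `inplace=True` that keeps the original available.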




In [99]:
emps.drop(columns=['postal_code', 'country', 'yearly_salary']) # dropping columns

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,city
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,Seattle
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,Seattle
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,Seattle
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,Southlake
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,Southlake
...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,Toronto
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,London
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,Munich
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,Seattle


In [100]:
emps # emps is not changed because we did not use inplace=True

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country,yearly_salary
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America,288000
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America,204000
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America,204000
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America,108000
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America,72000
...,...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada,72000
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom,78000
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany,120000
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America,144000


In [101]:
emps.drop(columns=['yearly_salary'], inplace=True)
emps

employee_id,first_name,last_name,job_title,salary,hire_date,department_name,address,postal_code,city,country
100,Steven,King,President,24000,1997-06-17,Executive,2004 Charade Rd,98199,Seattle,United States of America
101,Neena,Kochhar,Administration Vice President,17000,1999-09-21,Executive,2004 Charade Rd,98199,Seattle,United States of America
102,Lex,De Haan,Administration Vice President,17000,2003-01-13,Executive,2004 Charade Rd,98199,Seattle,United States of America
103,Alexander,Hunold,Programmer,9000,2000-01-03,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
104,Bruce,Ernst,Programmer,6000,2001-05-21,IT,2014 Jabberwocky Rd,26192,Southlake,United States of America
...,...,...,...,...,...,...,...,...,...,...
202,Pat,Fay,Marketing Representative,6000,2007-08-17,Marketing,147 Spadina Ave,M5V 2L7,Toronto,Canada
203,Susan,Mavris,Human Resources Representative,6500,2004-06-07,Human Resources,8204 Arthur St,,London,United Kingdom
204,Hermann,Baer,Public Relations Representative,10000,2004-06-07,Public Relations,Schwanthalerstr. 7031,80925,Munich,Germany
205,Shelley,Higgins,Accounting Manager,12000,2004-06-07,Accounting,2004 Charade Rd,98199,Seattle,United States of America
