# Pandas - Updating rows and columns - Modifying data within dataframes

Notebook structure:

1. Updating column names
    1. Renaming ALL the columns
    2. Changing something specific about ALL the column names
    3. Changing SOME column names
2. Updating data in our rows
    1. Changing ALL the values in a SINGLE row
    2. Changing a SINGLE value in a SINGLE row
    3. Changing MULTIPLE values in a SINGLE row, but NOT ALL of them
    4. Some other *indexers* in *pandas*
    5. Changing a SINGLE value in MULTIPLE rows
    6. `.apply()` method
        1. `.apply()` method on `Series` objects
        2. `.apply()` method on `DataFrame` objects
        3. Summary of the `.apply()` method
    7. `.applymap()` method
    8. `.map()` method
3. Now, let's go back to the Stack Overflow dataset and work with it

In [409]:
people = {
    "first": ["Nabeel", "Jane", "John"],
    "last": ["Malik", "Doe", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "john_doe@email.com"]
}

In [410]:
import pandas as pd
df_from_dict = pd.DataFrame(people)

In [411]:
df_from_dict

Unnamed: 0,first,last,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [412]:
df_from_dict.columns

Index(['first', 'last', 'email'], dtype='object')

In [413]:
type(df_from_dict.columns)

pandas.core.indexes.base.Index

Note: The `DataFrame.columns` attribute is an `Index` *object*.

## Updating column names

The part of the notebook structure related to this section is:
1. Updating column names
    1. Renaming ALL the columns
    2. Changing something specific about ALL the column names
    3. Changing SOME column names

### Renaming ALL the columns: `DataFrame.columns = [list]`:

To update ALL the columns, we can simply do an assignment using the `DataFrame.columns` attribute:

In [414]:
df_from_dict

Unnamed: 0,first,last,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [415]:
df_from_dict.columns = ['first name', 'last name', 'email']

In [416]:
df_from_dict.columns

Index(['first name', 'last name', 'email'], dtype='object')

In [417]:
df_from_dict

Unnamed: 0,first name,last name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


Note that the `DataFrame.columns` assignment above modifies the dataframe *inplace*. The new list must contain exactly one name for each existing column.

This method of changing column names is almost never used since we rarely have to change the names of ALL the columns in our dataframes. Usually, we might want to change:
- either something specific about ALL the column names,
- or the names of SOME columns.
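One practical consequence of assigning to `DataFrame.columns` is that the list has to match the number of columns. The sketch below uses a small stand-in dataframe (not the notebook's own `df_from_dict`) to show what happens when it does not:

```python
import pandas as pd

# Stand-in dataframe with three columns, for illustration only.
df = pd.DataFrame({
    "first": ["Nabeel", "Jane"],
    "last": ["Malik", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com"],
})

# Two names for three columns: pandas rejects the assignment.
try:
    df.columns = ["first name", "last name"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# One name per column: the assignment succeeds.
df.columns = ["first name", "last name", "email"]
```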

### Changing something specific about ALL the column names

Let's say we want to change ALL the column names to UPPER CASE. For that, we can use a **list comprehension**.

In [418]:
df_from_dict.columns

Index(['first name', 'last name', 'email'], dtype='object')

In [419]:
df_from_dict.columns = [x.upper() for x in df_from_dict.columns]

In [420]:
df_from_dict

Unnamed: 0,FIRST NAME,LAST NAME,EMAIL
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


Note: the **list comprehension** itself only builds a new list of names. It is the assignment back to `DataFrame.columns` that changes the column names *inplace*.

Now, if we want to replace all the whitespaces (` `) in our column names with underscores (`_`), we could use the `.str.replace()` **string method** of the `Index` object.

In [421]:
df_from_dict.columns = df_from_dict.columns.str.replace(' ', '_')
df_from_dict

Unnamed: 0,FIRST_NAME,LAST_NAME,EMAIL
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


Note: the `.str.replace()` string method does not change the column names *inplace*. It returns a new `Index` object, which needs to be assigned back to `DataFrame.columns`.
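The two steps above (upper-casing and replacing spaces) can also be chained through the `.str` accessor. This is a minimal sketch on a stand-in dataframe; passing `regex=False` is my own choice here to make the literal-string replacement explicit:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first name": ["Nabeel"],
    "last name": ["Malik"],
    "email": ["nabeel_malik@email.com"],
})

# Each .str method returns a new Index, so the calls can be chained.
new_columns = df.columns.str.upper().str.replace(" ", "_", regex=False)

# Nothing has changed on the dataframe yet.
before = list(df.columns)

# The change only takes effect once the result is assigned back.
df.columns = new_columns
```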

### Changing SOME column names: `DataFrame.rename(columns={dict})` method:

If we want to change SOME specific column names, we can use the `DataFrame.rename()` **method** and pass a *dictionary* to the `columns=` *parameter*, with the old name(s) as the *key(s)* and the new name(s) as the *value(s)*,<br> i.e. `DataFrame.rename(columns={<old name>: <new name>})`

In [422]:
df_from_dict

Unnamed: 0,FIRST_NAME,LAST_NAME,EMAIL
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [423]:
df_from_dict.rename(columns={'FIRST_NAME': 'first', 'LAST_NAME': 'last'}, inplace=True)

In [424]:
df_from_dict

Unnamed: 0,first,last,EMAIL
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


Note: by default, the `DataFrame.rename()` method returns a new dataframe and leaves the original untouched. We have to use the `inplace=True` flag to implement the changes *inplace*.
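Besides a dictionary, `DataFrame.rename()` also accepts a function for `columns=`, which is applied to every column name. A short sketch on a stand-in dataframe, leaving out `inplace=True` to show that the original is not modified:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "FIRST_NAME": ["Nabeel"],
    "LAST_NAME": ["Malik"],
    "EMAIL": ["nabeel_malik@email.com"],
})

# str.lower is called on each column name; a new dataframe is returned.
renamed = df.rename(columns=str.lower)
```

Here `renamed` has lower-case column names, while `df` keeps the original ones.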

## Updating data in our rows

The part of the notebook structure related to this section is:
2. Updating data in our rows
    1. Changing ALL the values in a SINGLE row
    2. Changing a SINGLE value in a SINGLE row
    3. Changing MULTIPLE values in a SINGLE row, but NOT ALL of them
    4. Some other *indexers* in *pandas*
    5. Changing a SINGLE value in MULTIPLE rows
    6. `.apply()` method
        1. `.apply()` method on `Series` objects
        2. `.apply()` method on `DataFrame` objects
        3. Summary of the `.apply()` method
    7. `.applymap()` method
    8. `.map()` method

Let us first redefine the dataframe that we will work with:

In [425]:
people = {
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "john_doe@email.com"]
}

In [426]:
df_from_dict = pd.DataFrame(people)
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


We know that we can look up values using the `.loc` and `.iloc` indexers. We will use them as a starting point for setting values as well.

Note: Using **indexers** is the only recommended way to change values in SINGLE rows in a `DataFrame`.

In [427]:
df_from_dict.loc[2]

first_name                  John
last_name                    Doe
email         john_doe@email.com
Name: 2, dtype: object

### Changing ALL the values in a SINGLE row:

Now, let's say we want to **change the entire last row of the dataframe**:

In [428]:
df_from_dict.loc[2] = ['Bob', 'Wallace', 'bob_wallace@email.com']

In [429]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,Bob,Wallace,bob_wallace@email.com


### Changing a SINGLE value in a SINGLE row:

If we want to **change Bob's email only**:

In [430]:
df_from_dict.loc[2, 'email'] = 'bob_wallace@gmail.com'

In [431]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,Bob,Wallace,bob_wallace@gmail.com


### Changing MULTIPLE values in a SINGLE row, but NOT ALL of them:

In [432]:
df_from_dict.loc[2, ['last_name', 'email']] = ['Dyson', 'bob_dyson@gmail.com']

In [433]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,Bob,Dyson,bob_dyson@gmail.com
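The same kind of update can be written with `.iloc`, using integer positions instead of labels. This is a sketch on a freshly built stand-in dataframe, so it does not depend on the cells above:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "Bob"],
    "last_name": ["Malik", "Doe", "Wallace"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "bob_wallace@gmail.com"],
})

# Row position 2, column positions 1 and 2 ('last_name' and 'email').
df.iloc[2, [1, 2]] = ["Dyson", "bob_dyson@gmail.com"]

# Look the same cells up by label to confirm the update.
updated = df.loc[2, ["last_name", "email"]].tolist()
```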


### Some other indexers in pandas:

Just like `.loc` and `.iloc`, `pandas` has two other (lesser-known and less-used) indexers: `.at` and `.iat`. In this notebook, I intend to use the `.loc` and `.iloc` indexers exclusively.

> `.loc`: 
- Selection by label
- Can access a 1-D array of elements at the same time

> `.iloc`: 
- Selection by position
- Can access a 1-D array of elements at the same time

> `.at`: 
- Selection by label
- Works very similarly to `.loc`, but only for scalar indexers
- Accesses a single value at a time
- *Advantage:* faster than `.loc`
- *Disadvantage:* cannot use arrays for indexers

> `.iat`: 
- Selection by position
- Works very similarly to `.iloc`, but only for scalar indexers
- Accesses a single value at a time
- *Advantage:* faster than `.iloc`
- *Disadvantage:* cannot use arrays for indexers
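To make the difference concrete, here is a small sketch that reads and writes single cells with `.at` and `.iat`, and then selects several rows at once with `.loc`. The dataframe and the replacement values are stand-ins for illustration:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "john_doe@email.com"],
})

# Scalar reads: one row label/position and one column label/position.
by_label = df.at[2, "email"]
by_position = df.iat[2, 2]

# Scalar writes work the same way.
df.at[0, "last_name"] = "MALIK"
df.iat[1, 0] = "Janet"

# .loc accepts a list of row labels, which .at does not.
several_rows = df.loc[[0, 1], "first_name"].tolist()
```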

In [434]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,Bob,Dyson,bob_dyson@gmail.com


In [435]:
df_from_dict.at[2, 'last_name'] = 'Dyson'

In [436]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,Bob,Dyson,bob_dyson@gmail.com


### Changing a SINGLE value in MULTIPLE rows:

Let us first redefine the dataframe that we will work with:

In [437]:
people = {
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "john_doe@email.com"]
}

In [438]:
df_from_dict = pd.DataFrame(people)

In [439]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


Now, let's say we want to **change all the last names to UPPER CASE**:

In [440]:
df_from_dict['last_name'] = df_from_dict['last_name'].str.upper()

In [441]:
df_from_dict

Unnamed: 0,first_name,last_name,email
0,Nabeel,MALIK,nabeel_malik@email.com
1,Jane,DOE,jane_doe@email.com
2,John,DOE,john_doe@email.com


Note: the `.str.upper()` string method does not change the column values *inplace*. It returns a new `Series` object, which needs to be assigned back to the column as shown above.
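If only SOME of the rows should be changed, the same idea can be combined with `.loc` and a boolean mask. The following is a sketch on a stand-in dataframe; the condition (last name equal to `'Doe'`) is chosen purely for illustration:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
})

# Boolean Series: True for the rows we want to update.
is_doe = df["last_name"] == "Doe"

# Upper-case the last name only in the selected rows.
df.loc[is_doe, "last_name"] = df.loc[is_doe, "last_name"].str.upper()
```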

We will now look at the `.apply()`, `.map()`, `.applymap()` and `.replace()` methods.

### `.apply()` method

The `.apply()` **method** is used to *apply* a function to the values in a `Series` or a `DataFrame` object.

So, the `.apply()` **method** can be called as `pandas.Series.apply()` or `pandas.DataFrame.apply()`.

In [442]:
employees = {
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
    "email": ["nabeel_malik@email.com", "jane_doe@email.com", "john_doe@email.com"],
    "salary": [80000, 65000, 60000]
}

In [443]:
df_emp = pd.DataFrame(employees)

In [444]:
df_emp

Unnamed: 0,first_name,last_name,email,salary
0,Nabeel,Malik,nabeel_malik@email.com,80000
1,Jane,Doe,jane_doe@email.com,65000
2,John,Doe,john_doe@email.com,60000


#### `.apply()` method on `Series` objects:

First, let's apply the `.apply()` method on `Series` objects:

If we want to **see the character length of all employee emails** using the `len()` function:

In [445]:
df_emp['email'].apply(len)

0    22
1    18
2    18
Name: email, dtype: int64

Now, let's write a function to **update the employee salaries**:

In [446]:
def update_salary(old_salary):
    return old_salary*1.1

In [447]:
df_emp['salary'] = df_emp['salary'].apply(update_salary)

In [448]:
df_emp

Unnamed: 0,first_name,last_name,email,salary
0,Nabeel,Malik,nabeel_malik@email.com,88000.0
1,Jane,Doe,jane_doe@email.com,71500.0
2,John,Doe,john_doe@email.com,66000.0


Now, instead of separately defining a function, we could also use **lambda functions** here:

In [449]:
df_emp['salary'] = df_emp['salary'].apply(lambda x: x*1.1)

In [450]:
df_emp

Unnamed: 0,first_name,last_name,email,salary
0,Nabeel,Malik,nabeel_malik@email.com,96800.0
1,Jane,Doe,jane_doe@email.com,78650.0
2,John,Doe,john_doe@email.com,72600.0
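For simple arithmetic like this, `.apply()` is not strictly needed: the same result can be obtained by operating on the whole `Series` at once. A minimal sketch, using a stand-in `Series` with the original salaries:

```python
import pandas as pd

# Stand-in Series with the original salaries, for illustration only.
salaries = pd.Series([80000, 65000, 60000], name="salary")

# Element by element, through a lambda function.
via_apply = salaries.apply(lambda x: x * 1.1)

# On the whole Series at once.
vectorized = salaries * 1.1

# Largest absolute difference between the two results.
largest_gap = (via_apply - vectorized).abs().max()
```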


#### `.apply()` method on `DataFrame` objects:

When applied to a `Series` object, the `.apply()` method applies to **each value** in the `Series`:

In [451]:
df_emp['email'].apply(len)

0    22
1    18
2    18
Name: email, dtype: int64

However, when applied to a `DataFrame` object, the `.apply()` method applies to **each** `Series` **in the** `DataFrame` **along the specified axis**.

The **axis** is specified with the `axis=` **flag**.

> `axis=` **flag**:
- 0 (default): index: apply function to each column
- 1: columns: apply function to each row

In [452]:
df_emp.apply(len)

first_name    3
last_name     3
email         3
salary        3
dtype: int64

So, for a particular column, the above command is the same as the following command:

In [453]:
len(df_emp['first_name'])

3

Instead of applying a function to each *column*, a function could also be applied to each *row*:

In [454]:
df_emp.apply(len, axis='columns')

0    4
1    4
2    4
dtype: int64
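Applying a function to each row becomes more useful when the function combines values from several columns. As a sketch, the lambda below receives one row at a time (as a `Series`) and builds a full name; the dataframe is a stand-in for illustration:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
})

# With axis='columns', each row is passed to the function as a Series.
full_names = df.apply(
    lambda row: f"{row['first_name']} {row['last_name']}",
    axis="columns",
)
```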

So basically, we want to use `.apply()` on a `DataFrame` with *functions* that make sense to apply on a `Series` object.

So, for instance, let's say that we want to use `.apply()` on a `DataFrame` to grab the **minimum value in each column**. We could do that by passing the `pandas.Series.min()` method to the `.apply()` method:

In [455]:
df_emp

Unnamed: 0,first_name,last_name,email,salary
0,Nabeel,Malik,nabeel_malik@email.com,96800.0
1,Jane,Doe,jane_doe@email.com,78650.0
2,John,Doe,john_doe@email.com,72600.0


In [456]:
df_emp.apply(pd.Series.min)

first_name                  Jane
last_name                    Doe
email         jane_doe@email.com
salary                     72600
dtype: object

The `pandas.Series.min()` method makes more sense for numerical data. For *strings*, the minimum is simply the value that comes first in alphabetical (lexicographic) order.

We can use **lambda functions** inside `.apply()` for `DataFrame` objects as well, just remembering that the *lambda function* will be working on the `Series` objects in the `DataFrame`:

In [457]:
df_emp.apply(lambda x: x.max())

first_name                    Nabeel
last_name                      Malik
email         nabeel_malik@email.com
salary                         96800
dtype: object

#### Summary of the `.apply()` method:

> Running `.apply()` on a `Series` applies a function to every *value* in the `Series`,

> Running `.apply()` on a `DataFrame` applies a function to every `Series` in the `DataFrame`.
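One way to check this summary is to ask `.apply()` what kind of object it hands to the function in each case. The sketch below does this on a stand-in dataframe with string columns only:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
})

# On a Series, the function receives each individual value.
per_value = df["first_name"].apply(lambda value: type(value).__name__)

# On a DataFrame, the function receives each column as a Series.
per_column = df.apply(lambda column: type(column).__name__)
```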

Now, you might be wondering if there is a way to apply a function to every individual value in a `DataFrame`.

That is what `.applymap()` is used for.

### `.applymap()` method

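The `.applymap()` method only works on `DataFrame` objects. `Series` objects do not have the `.applymap()` method. Note that in pandas 2.1 and later, `DataFrame.applymap()` is deprecated in favor of `DataFrame.map()`, which does the same thing.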

In [458]:
df_emp

Unnamed: 0,first_name,last_name,email,salary
0,Nabeel,Malik,nabeel_malik@email.com,96800.0
1,Jane,Doe,jane_doe@email.com,78650.0
2,John,Doe,john_doe@email.com,72600.0


Let's just delete the 'salary' column to be able to apply the `len()` function to this `DataFrame`:

In [459]:
del df_emp['salary']

In [460]:
df_emp

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [461]:
df_emp.applymap(len)

Unnamed: 0,first_name,last_name,email
0,6,5,22
1,4,3,18
2,4,3,18


In [462]:
df_emp.applymap(str.lower)

Unnamed: 0,first_name,last_name,email
0,nabeel,malik,nabeel_malik@email.com
1,jane,doe,jane_doe@email.com
2,john,doe,john_doe@email.com
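Since pandas 2.1, the element-wise method on `DataFrame` objects is called `DataFrame.map()`, and `.applymap()` is deprecated. The sketch below picks whichever of the two is available, so it should run on both older and newer versions; the `elementwise` name is just a local helper for this example:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({
    "first_name": ["Nabeel", "Jane", "John"],
    "last_name": ["Malik", "Doe", "Doe"],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# len is applied to every individual value in the dataframe.
lengths = elementwise(len)
```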


### `.map()` method:

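The `.map()` method used in this section is the `Series.map()` method. (`DataFrame` objects only gained a `.map()` method in pandas 2.1, as the replacement for `.applymap()`.)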

`.map()` is used for substituting each *value* in a `Series` with another *value*.

In [463]:
df_emp

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [464]:
df_emp['first_name'].map({'Nabeel': 'Omar', 'John': 'Chris'})

0     Omar
1      NaN
2    Chris
Name: first_name, dtype: object

Note: The *values* that we did not substitute with the `.map()` method were converted to `NaN` (Not a Number) *values*.

What if we wanted to keep the other names, and just substitute the names provided in the *dictionary* passed to the `.map()` method? For that, we could use the `.replace()` method.

In [465]:
df_emp

Unnamed: 0,first_name,last_name,email
0,Nabeel,Malik,nabeel_malik@email.com
1,Jane,Doe,jane_doe@email.com
2,John,Doe,john_doe@email.com


In [466]:
df_emp['first_name'].replace({'Nabeel': 'Omar', 'John': 'Chris'})

0     Omar
1     Jane
2    Chris
Name: first_name, dtype: object
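If we prefer to stay with `.map()`, one possible alternative is to fill the resulting `NaN` values back in from the original `Series`. This is a sketch of that idea on a stand-in `Series`, next to the `.replace()` version for comparison:

```python
import pandas as pd

# Stand-in Series, for illustration only.
names = pd.Series(["Nabeel", "Jane", "John"], name="first_name")
substitutions = {"Nabeel": "Omar", "John": "Chris"}

# .replace() keeps the values that are not in the dictionary.
with_replace = names.replace(substitutions)

# .map() turns them into NaN, so we fill those from the original values.
with_map = names.map(substitutions).fillna(names)
```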

Important note for `.apply()`, `.applymap()` and `.map()` methods:

- These methods DO NOT apply changes to the `DataFrame` or `Series` objects *inplace*;
- They DO NOT have an `inplace=` flag;
- They just return the modified `DataFrame` or `Series` objects that need to be reassigned.
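The following sketch illustrates the point with `.apply()`: calling the method alone leaves the dataframe as it was, and the change only shows up after the result is assigned back. The mixed-case emails are made up for this example:

```python
import pandas as pd

# Stand-in dataframe with made-up mixed-case emails.
df = pd.DataFrame({"email": ["Nabeel_Malik@Email.com", "Jane_Doe@Email.com"]})

# The result is a new Series; df itself is not modified.
lowered = df["email"].apply(str.lower)
unchanged = df["email"].tolist()

# Assigning the result back is what updates the dataframe.
df["email"] = lowered
```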

***

## Now, let's go back to the Stack Overflow dataset and work with it

In [467]:
df = pd.read_csv('work_directory/pandas/data/survey_results_public.csv', index_col='Respondent')
df_schema = pd.read_csv('work_directory/pandas/data/survey_results_schema.csv', index_col='Column')

In [468]:
pd.set_option('display.max_rows', 85)
pd.set_option('display.max_columns', 85)

In [469]:
df.head()

Respondent,MainBranch,Hobbyist,OpenSourcer,OpenSource,Employment,Country,Student,EdLevel,UndergradMajor,EduOther,OrgSize,DevType,YearsCode,Age1stCode,YearsCodePro,CareerSat,JobSat,MgrIdiot,MgrMoney,MgrWant,JobSeek,LastHireDate,LastInt,FizzBuzz,JobFactors,ResumeUpdate,CurrencySymbol,CurrencyDesc,CompTotal,CompFreq,ConvertedComp,WorkWeekHrs,WorkPlan,WorkChallenge,WorkRemote,WorkLoc,ImpSyn,CodeRev,CodeRevHrs,UnitTests,PurchaseHow,PurchaseWhat,LanguageWorkedWith,LanguageDesireNextYear,DatabaseWorkedWith,DatabaseDesireNextYear,PlatformWorkedWith,PlatformDesireNextYear,WebFrameWorkedWith,WebFrameDesireNextYear,MiscTechWorkedWith,MiscTechDesireNextYear,DevEnviron,OpSys,Containers,BlockchainOrg,BlockchainIs,BetterLife,ITperson,OffOn,SocialMedia,Extraversion,ScreenName,SOVisit1st,SOVisitFreq,SOVisitTo,SOFindAnswer,SOTimeSaved,SOHowMuchTime,SOAccount,SOPartFreq,SOJobs,EntTeams,SOComm,WelcomeChange,SONewContent,Age,Gender,Trans,Sexuality,Ethnicity,Dependents,SurveyLength,SurveyEase
1,I am a student who is learning to code,Yes,Never,The quality of OSS and closed source software ...,"Not employed, and not looking for work",United Kingdom,No,Primary/elementary school,,"Taught yourself a new language, framework, or ...",,,4.0,10,,,,,,,,,,,,,,,,,,,,,,,,,,,,,HTML/CSS;Java;JavaScript;Python,C;C++;C#;Go;HTML/CSS;Java;JavaScript;Python;SQL,SQLite,MySQL,MacOS;Windows,Android;Arduino;Windows,Django;Flask,Flask;jQuery,Node.js,Node.js,IntelliJ;Notepad++;PyCharm,Windows,I do not use containers,,,Yes,"Fortunately, someone else has that title",Yes,Twitter,Online,Username,2017,A few times per month or weekly,Find answers to specific questions;Learn how t...,3-5 times per week,Stack Overflow was much faster,31-60 minutes,No,,"No, I didn't know that Stack Overflow had a jo...","No, and I don't know what those are",Neutral,Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,14.0,Man,No,Straight / Heterosexual,,No,Appropriate in length,Neither easy nor difficult
2,I am a student who is learning to code,No,Less than once per year,The quality of OSS and closed source software ...,"Not employed, but looking for work",Bosnia and Herzegovina,"Yes, full-time","Secondary school (e.g. American high school, G...",,Taken an online course in programming or softw...,,"Developer, desktop or enterprise applications;...",,17,,,,,,,I am actively looking for a job,I've never had a job,,,Financial performance or funding status of the...,"Something else changed (education, award, medi...",,,,,,,,,,,,,,,,,C++;HTML/CSS;Python,C++;HTML/CSS;JavaScript;SQL,,MySQL,Windows,Windows,Django,Django,,,Atom;PyCharm,Windows,I do not use containers,,Useful across many domains and could change ma...,Yes,Yes,Yes,Instagram,Online,Username,2017,Daily or almost daily,Find answers to specific questions;Learn how t...,3-5 times per week,Stack Overflow was much faster,11-30 minutes,Yes,A few times per month or weekly,"No, I knew that Stack Overflow had a job board...","No, and I don't know what those are","Yes, somewhat",Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,19.0,Man,No,Straight / Heterosexual,,No,Appropriate in length,Neither easy nor difficult
3,"I am not primarily a developer, but I write co...",Yes,Never,The quality of OSS and closed source software ...,Employed full-time,Thailand,No,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Web development or web design,"Taught yourself a new language, framework, or ...",100 to 499 employees,"Designer;Developer, back-end;Developer, front-...",3.0,22,1,Slightly satisfied,Slightly satisfied,Not at all confident,Not sure,Not sure,"I’m not actively looking, but I am open to new...",1-2 years ago,Interview with people in peer roles,No,"Languages, frameworks, and other technologies ...",I was preparing for a job search,THB,Thai baht,23000.0,Monthly,8820.0,40.0,There's no schedule or spec; I work on what se...,Distracting work environment;Inadequate access...,Less than once per month / Never,Home,Average,No,,"No, but I think we should",Not sure,I have little or no influence,HTML/CSS,Elixir;HTML/CSS,PostgreSQL,PostgreSQL,,,,Other(s):,,,Vim;Visual Studio Code,Linux-based,I do not use containers,,,Yes,Yes,Yes,Reddit,In real life (in person),Username,2011,A few times per week,Find answers to specific questions;Learn how t...,6-10 times per week,They were about the same,,Yes,Less than once per month or monthly,Yes,"No, I've heard of them, but I am not part of a...",Neutral,Just as welcome now as I felt last year,Tech meetups or events in your area;Courses on...,28.0,Man,No,Straight / Heterosexual,,Yes,Appropriate in length,Neither easy nor difficult
4,I am a developer by profession,No,Never,The quality of OSS and closed source software ...,Employed full-time,United States,No,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Taken an online course in programming or softw...,100 to 499 employees,"Developer, full-stack",3.0,16,Less than 1 year,Very satisfied,Slightly satisfied,Very confident,No,Not sure,I am not interested in new job opportunities,Less than a year ago,"Write code by hand (e.g., on a whiteboard);Int...",No,"Languages, frameworks, and other technologies ...",I was preparing for a job search,USD,United States dollar,61000.0,Yearly,61000.0,80.0,There's no schedule or spec; I work on what se...,,Less than once per month / Never,Home,A little below average,No,,"No, but I think we should",Developers typically have the most influence o...,I have little or no influence,C;C++;C#;Python;SQL,C;C#;JavaScript;SQL,MySQL;SQLite,MySQL;SQLite,Linux;Windows,Linux;Windows,,,.NET,.NET,Eclipse;Vim;Visual Studio;Visual Studio Code,Windows,I do not use containers,Not at all,"Useful for decentralized currency (i.e., Bitcoin)",Yes,SIGH,Yes,Reddit,In real life (in person),Username,2014,Daily or almost daily,Find answers to specific questions;Pass the ti...,1-2 times per week,Stack Overflow was much faster,31-60 minutes,Yes,Less than once per month or monthly,Yes,"No, and I don't know what those are","No, not really",Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,22.0,Man,No,Straight / Heterosexual,White or of European descent,No,Appropriate in length,Easy
5,I am a developer by profession,Yes,Once a month or more often,"OSS is, on average, of HIGHER quality than pro...",Employed full-time,Ukraine,No,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Taken an online course in programming or softw...,"10,000 or more employees","Academic researcher;Developer, desktop or ente...",16.0,14,9,Very dissatisfied,Slightly dissatisfied,Somewhat confident,Yes,No,I am not interested in new job opportunities,Less than a year ago,"Write any code;Write code by hand (e.g., on a ...",No,"Industry that I'd be working in;Languages, fra...",I was preparing for a job search,UAH,Ukrainian hryvnia,,,,55.0,There is a schedule and/or spec (made by me or...,Being tasked with non-development work;Inadequ...,A few days each month,Office,A little above average,"Yes, because I see value in code review",,"Yes, it's part of our process",Not sure,I have little or no influence,C++;HTML/CSS;Java;JavaScript;Python;SQL;VBA,HTML/CSS;Java;JavaScript;SQL;WebAssembly,Couchbase;MongoDB;MySQL;Oracle;PostgreSQL;SQLite,Couchbase;Firebase;MongoDB;MySQL;Oracle;Postgr...,Android;Linux;MacOS;Slack;Windows,Android;Docker;Kubernetes;Linux;Slack,Django;Express;Flask;jQuery;React.js;Spring,Flask;jQuery;React.js;Spring,Cordova;Node.js,Apache Spark;Hadoop;Node.js;React Native,IntelliJ;Notepad++;Vim,Linux-based,"Outside of work, for personal projects",Not at all,,Yes,Also Yes,Yes,Facebook,In real life (in person),Username,I don't remember,Multiple times per day,Find answers to specific questions,More than 10 times per week,Stack Overflow was much faster,,Yes,A few times per month or weekly,"No, I knew that Stack Overflow had a job board...","No, I've heard of them, but I am not part of a...","Yes, definitely",Just as welcome now as I felt last year,Tech meetups or events in your area;Courses on...,30.0,Man,No,Straight / Heterosexual,White or of European descent;Multiracial,No,Appropriate in length,Easy


Now, let's say we want to rename the *'ConvertedComp'* column name to *'SalaryUSD'*:

In [470]:
df.rename(columns={'ConvertedComp': 'SalaryUSD'}, inplace=True)

In [471]:
df['SalaryUSD']

Respondent
1            NaN
2            NaN
3         8820.0
4        61000.0
5            NaN
          ...   
996          NaN
997      71966.0
998          NaN
999     106250.0
1000    318864.0
Name: SalaryUSD, Length: 997, dtype: float64

So that seems to have worked.
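It is worth checking the result like this, because by default `DataFrame.rename()` silently ignores keys that do not match any column (for example, a misspelled name). Passing `errors='raise'` makes such a mismatch visible. A small sketch on a stand-in dataframe, with a deliberately misspelled key:

```python
import pandas as pd

# Stand-in dataframe, for illustration only.
df = pd.DataFrame({"ConvertedComp": [8820.0, 61000.0]})

# Deliberately misspelled key: nothing is renamed, and no error is raised.
silently_ignored = df.rename(columns={"convertedcomp": "SalaryUSD"})

# With errors='raise', the unknown key raises a KeyError instead.
try:
    df.rename(columns={"convertedcomp": "SalaryUSD"}, errors="raise")
    rename_error = None
except KeyError as exc:
    rename_error = exc
```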

Now, let's say that we want to take the *'Hobbyist'* column, and replace all the 
- *'Yes'* values with `True`
- *'No'* values with `False`

In [472]:
df['Hobbyist'] = df['Hobbyist'].map({'Yes': True, 'No': False})

In [473]:
df['Hobbyist']

Respondent
1        True
2       False
3        True
4       False
5        True
        ...  
996      True
997     False
998      True
999      True
1000     True
Name: Hobbyist, Length: 997, dtype: bool

Note: In case we had values other than *'Yes'* and *'No'* in this column, and we wanted to keep those values untouched, we could have used the `.replace()` method instead of `.map()`.
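To see what would happen with such an extra value, here is a sketch on a small stand-in `Series`. The `'Maybe'` answer is hypothetical and does not come from the survey data; the second call shows one way to keep unmatched values while still using `.map()`, by passing a function that falls back to the original value:

```python
import pandas as pd

# Stand-in Series; 'Maybe' is a hypothetical extra answer.
answers = pd.Series(["Yes", "No", "Maybe"])
mapping = {"Yes": True, "No": False}

# With a dictionary, values missing from it become NaN.
mapped = answers.map(mapping)

# With a function, we can fall back to the original value ourselves.
kept = answers.map(lambda value: mapping.get(value, value))
```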