## <span style="color:black"><u>Manipulating a series in Pandas</u></span>

* This video investigates how we can manipulate a pandas Series in a sophisticated manner using `replace()`, `apply()` and `map()`
* In doing so, we will introduce [lambda functions](https://towardsdatascience.com/lambda-functions-with-practical-examples-in-python-45934f3653a8), a useful Python concept that works really nicely here
* Data can be messy, so hopefully this video helps when you come across a dataset where one or more series need to be changed or cleaned in some way
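* The table below shows where each method can be used (note that from pandas 2.1 onwards, `DataFrame.applymap()` is deprecated in favour of `DataFrame.map()`)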


| Method     | Series | DataFrame |
|------------|--------|-----------|
| replace()  | ✅     | ✅        |
| apply()    | ✅     | ✅        |   
| applymap() | -      | ✅        |   
| map()      | ✅     | -         |   
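Before working on the survey data, here is a minimal sketch of how the three Series methods differ when a value has no matching rule. The toy Series is made up for illustration and is not part of the survey file.

```python
import pandas as pd

# Toy data (hypothetical, not from the survey file)
s = pd.Series(["Yes", "No", "Maybe"])

# map(): values missing from the dictionary become NaN
mapped = s.map({"Yes": 1, "No": 0})

# replace(): values missing from the dictionary are left unchanged
replaced = s.replace({"Yes": "Y", "No": "N"})

# apply(): the function decides what happens to every value
applied = s.apply(lambda x: 1 if x == "Yes" else 0)

print(mapped.isna().tolist())  # [False, False, True]
print(replaced.tolist())       # ['Y', 'N', 'Maybe']
print(applied.tolist())        # [1, 0, 0]
```

Which behaviour is preferable depends on whether unexpected values should stand out (`map()`), pass through (`replace()`), or be handled by custom logic (`apply()`).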


In [1]:
import re
import numpy as np
import pandas as pd
import nums_from_string
from forex_python.converter import CurrencyRates

# Show all columns
pd.options.display.max_columns = None

In [2]:
df = pd.read_csv("survey.csv", sep = ';')
df.head()

Unnamed: 0,Respondent,MainBranch,Hobbyist,Age,Age1stCode,CompFreq,CompTotal,Country,CurrencySymbol,DatabaseDesireNextYear,DatabaseWorkedWith,DevType,EdLevel,Employment,Ethnicity,Gender,JobFactors,JobSat,JobSeek,LanguageDesireNextYear,LanguageWorkedWith,MiscTechDesireNextYear,MiscTechWorkedWith,NEWCollabToolsDesireNextYear,NEWCollabToolsWorkedWith,NEWDevOps,NEWDevOpsImpt,NEWEdImpt,NEWJobHunt,NEWJobHuntResearch,NEWLearn,NEWOffTopic,NEWOnboardGood,NEWOtherComms,NEWOvertime,NEWPurchaseResearch,NEWPurpleLink,NEWSOSites,NEWStuck,OpSys,OrgSize,PlatformDesireNextYear,PlatformWorkedWith,PurchaseWhat,Sexuality,SOAccount,SOComm,SOPartFreq,SOVisitFreq,SurveyEase,SurveyLength,Trans,UndergradMajor,WebframeDesireNextYear,WebframeWorkedWith,WelcomeChange,WorkWeekHrs,YearsCode,YearsCodePro
0,42,I am a developer by profession,no,32.0,14,Yearly,130000.0 USD,United States,USD,Cassandra;DynamoDB;MongoDB,Microsoft SQL Server;MySQL;Oracle;Redis,"Developer, back-end;Developer, desktop or ente...","Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Employed full-time,White or of European descent,Man,Remote work options;How widely used or impactf...,Very satisfied,"I’m not actively looking, but I am open to new...",Bash/Shell/PowerShell;C#;Go;Java;Python;Rust,Bash/Shell/PowerShell;C#;HTML/CSS;Java;JavaScr...,.NET Core;Teraform,.NET;.NET Core;Node.js;Teraform,Confluence;Jira;Gitlab,Confluence;Jira;Github;Gitlab;Slack;Trello;Goo...,Yes,Somewhat important,Not at all important/not necessary,Curious about other opportunities;Better compe...,"Read company media, such as employee blogs or ...",Once every few years,Yes,Yes,Yes,Rarely: 1-2 days per year or less,Start a free trial;Ask developers I know/work ...,Amused,Stack Overflow (public Q&A for anyone who code...,Visit Stack Overflow;Go for a walk or other ph...,Windows,"1,000 to 4,999 employees",Android;AWS;Docker;Kubernetes;Raspberry Pi,AWS;Docker;Linux;MacOS;Raspberry Pi;Slack Apps...,I have some influence,Straight / Heterosexual,Yes,"No, not really",Less than once per month or monthly,A few times per month or weekly,Easy,Appropriate in length,No,"Computer science, computer engineering, or sof...",ASP.NET Core;Spring,ASP.NET;Flask;React.js;Spring,Somewhat less welcome now than last year,37.0,16,10
1,80,I am a developer by profession,Yes,32.0,11,Yearly,94500.0 USD,United States,USD,Microsoft SQL Server;MySQL,Elasticsearch;Microsoft SQL Server;MySQL,DevOps specialist,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Employed full-time,Hispanic or Latino/a/x;White or of European de...,Man,Flex time or a flexible schedule;Specific depa...,Very satisfied,"I’m not actively looking, but I am open to new...",Bash/Shell/PowerShell;C#;Dart;Go;TypeScript,Bash/Shell/PowerShell;C#;HTML/CSS;JavaScript;P...,.NET Core;Ansible;Teraform,.NET Core;Ansible,Confluence;Github;Gitlab;Microsoft Teams;Micro...,Confluence;Github;Gitlab;Microsoft Azure,Yes,Extremely important,Fairly important,Wanting to share accomplishments with a wider ...,Personal network - friends or family,Once a year,Not sure,No,No,Occasionally: 1-2 days per quarter but less th...,Start a free trial;Ask developers I know/work ...,"Hello, old friend",Stack Overflow (public Q&A for anyone who code...,Call a coworker or friend;Visit Stack Overflow...,Windows,"10,000 or more employees",Docker;Kubernetes;Linux;Microsoft Azure;Windows,Docker;Linux;Microsoft Azure;Windows,I have some influence,Straight / Heterosexual,Yes,"Yes, somewhat",Less than once per month or monthly,Multiple times per day,Easy,Too long,No,"Information systems, information technology, o...",Angular;ASP.NET Core,Angular;ASP.NET;ASP.NET Core,Just as welcome now as I felt last year,40.0,21,7
2,119,I am a developer by profession,Yes,35.0,15,Yearly,145000.0 USD,United States,USD,Redis,DynamoDB;MariaDB;PostgreSQL,"Developer, back-end;Developer, front-end;Devel...","Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Employed full-time,White or of European descent,Man,Flex time or a flexible schedule;How widely us...,Very satisfied,"I’m not actively looking, but I am open to new...",Bash/Shell/PowerShell;C#;HTML/CSS;Python;Rust;...,Bash/Shell/PowerShell;C#;Go;HTML/CSS;Java;Java...,.NET Core;Teraform,.NET Core;Node.js;Teraform;Unity 3D;Unreal Engine,Github;Microsoft Azure,Confluence;Jira;Github;Gitlab;Microsoft Azure;...,Yes,Extremely important,Very important,Better compensation;Trouble with leadership at...,"Read company media, such as employee blogs or ...",Once every few years,No,Yes,Yes,Often: 1-2 days per week or more,Ask developers I know/work with,Annoyed,Stack Overflow (public Q&A for anyone who code...,Meditate;Call a coworker or friend;Visit Stack...,Windows,2 to 9 employees,Docker;Kubernetes;Linux;MacOS;Microsoft Azure;...,Android;AWS;Docker;iOS;Kubernetes;Linux;MacOS;...,I have some influence,Straight / Heterosexual,Yes,"Yes, somewhat",A few times per week,Multiple times per day,Easy,Appropriate in length,No,"Another engineering discipline (such as civil,...",ASP.NET Core;Gatsby;React.js,ASP.NET Core;React.js,Just as welcome now as I felt last year,50.0,16,16
3,131,I am a developer by profession,Yes,37.0,11,Yearly,115000.0 USD,United States,USD,MongoDB;PostgreSQL;Redis,MySQL,"Developer, full-stack","Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Employed full-time,White or of European descent,Man,"Industry that I’d be working in;Languages, fra...",Slightly satisfied,"I’m not actively looking, but I am open to new...",Bash/Shell/PowerShell;Julia;Rust;TypeScript,Bash/Shell/PowerShell;C#;HTML/CSS;Java;JavaScr...,.NET Core,.NET;.NET Core;Node.js,Github;Slack,Confluence;Jira;Github;Slack,No,Extremely important,Somewhat important,Better compensation;Trouble with my teammates;...,"Read company media, such as employee blogs or ...",Every few months,No,Yes,Yes,Rarely: 1-2 days per year or less,Start a free trial;Visit developer communities...,"Hello, old friend",Stack Overflow (public Q&A for anyone who code...,Visit Stack Overflow;Go for a walk or other ph...,Windows,10 to 19 employees,Arduino;AWS;Docker;Raspberry Pi;Windows,AWS;Docker;Linux;Raspberry Pi;Windows,I have some influence,Straight / Heterosexual,Yes,"No, not at all",I have never participated in Q&A on Stack Over...,Daily or almost daily,Easy,Appropriate in length,No,Mathematics or statistics,React.js;Vue.js,Vue.js,Just as welcome now as I felt last year,40.0,5,3
4,158,I am a developer by profession,Yes,40.0,18,Yearly,170000.0 USD,United States,USD,Elasticsearch;Firebase;PostgreSQL;Redis,Microsoft SQL Server;MySQL;PostgreSQL;Redis,"Developer, back-end;Developer, front-end;Devel...","Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Employed full-time,Hispanic or Latino/a/x,Man,"Flex time or a flexible schedule;Languages, fr...",Slightly satisfied,"I’m not actively looking, but I am open to new...",Bash/Shell/PowerShell;Go;HTML/CSS;JavaScript;P...,Bash/Shell/PowerShell;HTML/CSS;JavaScript;Pyth...,Ansible;Flutter;Hadoop;React Native;TensorFlow...,Ansible;React Native;TensorFlow,Jira;Github;Slack;Microsoft Azure;Google Suite...,Jira;Github;Gitlab;Slack;Microsoft Azure;Googl...,No,Extremely important,Very important,Curious about other opportunities;Better compe...,Company reviews from third party sites (e.g. G...,Once a year,No,No,Yes,Occasionally: 1-2 days per quarter but less th...,Start a free trial,Annoyed,Stack Overflow (public Q&A for anyone who codes),Call a coworker or friend;Visit Stack Overflow...,MacOS,500 to 999 employees,Arduino;AWS;Docker;Heroku;Linux;MacOS;Microsof...,AWS;Docker;Heroku;Linux;MacOS;Microsoft Azure,I have a great deal of influence,Straight / Heterosexual,Yes,"Yes, definitely",A few times per month or weekly,Daily or almost daily,Easy,Appropriate in length,No,"Computer science, computer engineering, or sof...",Gatsby;React.js;Ruby on Rails;Vue.js,React.js;Ruby on Rails,Just as welcome now as I felt last year,35.0,22,18


In [None]:
# Columns we do not need for this tutorial
to_drop = ['Age1stCode', 'DatabaseDesireNextYear', 'DatabaseWorkedWith', 'DevType', 'EdLevel', 'Ethnicity', 'JobFactors']
df.drop(to_drop, axis = 1, inplace = True)

In [None]:
df.to_csv("survey.csv", sep = ';', index = False)

In [None]:
df.info()

* In the world of machine learning, our data will very often need to be in numeric form. So how can we get it there?
* For now, let's just see how we might convert a column with binary outcomes into zeroes and ones

---

#### 🧠 Exercise: 
* Change the 'Hobbyist' series from yes/no to 1/0 using the `map()` series method
* Change the 'Gender' series from Woman/Man to 1/0 using the `replace()` method
* Change the 'NEWOnboardGood' series from Yes/No to 1/0 using the `apply()` method


<u>Using Map</u>

In [None]:
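# See all unique values in the series (print, since only a cell's last expression is displayed)
print(df['Hobbyist'].unique())

# The values mix cases ('no', 'Yes'), so set everything to lowercase
df['Hobbyist'] = df['Hobbyist'].str.lower()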

In [None]:
# Use map()
df['Hobbyist'] = df['Hobbyist'].map({'no':0, 'yes':1})
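Because `map()` turns any value without a dictionary entry into NaN, it can be worth checking what was left unmapped. The sketch below uses a small made-up Series (not the survey data) and also strips stray whitespace, which is an extra step assumed here for illustration.

```python
import pandas as pd

# Toy data (hypothetical): mixed case, stray whitespace, and an unexpected label
hobbyist = pd.Series(["Yes", "no", " YES ", "Unsure"])

# Normalise the text before mapping
cleaned = hobbyist.str.strip().str.lower()
encoded = cleaned.map({"no": 0, "yes": 1})

# Labels that had no dictionary entry and were therefore turned into NaN
unmapped = cleaned[encoded.isna()].unique().tolist()

print(encoded.isna().tolist())  # [False, False, False, True]
print(unmapped)                 # ['unsure']
```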

<u>Using Replace</u>

In [None]:
# A dictionary handles both replacements in one call; any other values are left unchanged
df['Gender'] = df['Gender'].replace({'Woman': 1, 'Man': 0})

In [None]:
# Rename the column to 'Woman', since 1 now means 'Woman' and 0 means 'Man'
df.rename(columns = {'Gender':'Woman'}, inplace = True)

<u>Using Apply & Lambda Functions</u>

In [None]:
df['NEWOnboardGood'] = df['NEWOnboardGood'].apply(lambda x: 1 if x == 'Yes' else 0)
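A lambda is simply an unnamed, single-expression function. The sketch below, on made-up data, shows the lambda next to an equivalent named function, and highlights one side effect of the `else 0` branch: missing answers are also turned into 0. Passing the same function to `map()` with `na_action="ignore"` is one possible way to keep them missing.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical), including one missing answer
answers = pd.Series(["Yes", "No", np.nan, "Yes"])

# A lambda is an unnamed, single-expression function...
yes_to_one = lambda x: 1 if x == "Yes" else 0

# ...equivalent to this named version
def yes_to_one_named(x):
    return 1 if x == "Yes" else 0

with_apply = answers.apply(yes_to_one)
print(with_apply.tolist())  # [1, 0, 0, 1] -> the missing answer became 0

# map() accepts a function too; na_action="ignore" skips missing values,
# so they stay missing instead of being counted as "No"
with_map = answers.map(yes_to_one, na_action="ignore")
print(with_map.isna().tolist())  # [False, False, True, False]
```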

#### 🧠 Exercise: 
* Do the same thing with the 'NEWOtherComms' series, but this time use your own custom function

In [None]:
df['NEWOtherComms'].unique()

In [None]:
def do_conversion(mystring):
    """Converts 'Yes' to 1 and 'No' to 0"""
    if mystring == 'Yes':
        return 1
    else:
        return 0

In [None]:
df['NEWOtherComms'] = df['NEWOtherComms'].apply(do_conversion)

#### 🧠 Exercise: 
* Take the 'OrgSize' series and convert it completely into numeric form

In [None]:
def string_to_num(mystring):
    """Extracts all numbers from the string and returns their average, rounded to the nearest integer"""
    number_list = nums_from_string.get_nums(mystring)
    avg = np.mean(number_list)
    return int(round(avg, 0))

In [None]:
# Apply the function
df['OrgSize'] = df['OrgSize'].apply(string_to_num)
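If `nums_from_string` is not available, a similar conversion can be sketched with the standard `re` module. The function name `org_size_to_num` is hypothetical, and the sketch assumes labels shaped like "1,000 to 4,999 employees"; labels without digits, and missing values, are returned as NaN.

```python
import re
import numpy as np

def org_size_to_num(text):
    """Average of the numbers found in an OrgSize-style label, or NaN if there are none."""
    if not isinstance(text, str):
        return np.nan  # missing values arrive as float NaN
    # Match runs of digits that may contain thousands separators, e.g. "1,000"
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", text)]
    if not numbers:
        return np.nan
    return int(round(sum(numbers) / len(numbers)))

print(org_size_to_num("1,000 to 4,999 employees"))  # 3000
print(org_size_to_num("10,000 or more employees"))  # 10000
```

It would be applied in the same way as above, for example `df['OrgSize'].apply(org_size_to_num)`.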

#### 🧠 Exercise: 
* Adjust the 'JobSat' series to return a satisfaction score (use any method you like)

In [None]:
df.JobSat.unique()

In [None]:
mapping_vals = {'Very satisfied': 9,
                'Slightly satisfied': 6,
                'Neither satisfied nor dissatisfied': 5,
                'Slightly dissatisfied': 4,
                'Very dissatisfied': 2}

df['JobSat'] = df['JobSat'].map(mapping_vals)
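The scores in `mapping_vals` are a judgment call. If only the ordering of the answers matters, an ordered categorical is one alternative worth knowing about. The sketch below uses made-up answers with the same labels and reads off each label's rank.

```python
import pandas as pd

# Labels listed from least to most satisfied (same labels as in mapping_vals)
levels = [
    "Very dissatisfied",
    "Slightly dissatisfied",
    "Neither satisfied nor dissatisfied",
    "Slightly satisfied",
    "Very satisfied",
]

# Toy data (hypothetical), including one missing answer
job_sat = pd.Series(["Very satisfied", "Slightly dissatisfied", None, "Very dissatisfied"])

as_category = job_sat.astype(pd.CategoricalDtype(categories=levels, ordered=True))

# .cat.codes gives the position of each label in `levels`; missing values get -1
codes = as_category.cat.codes
print(codes.tolist())  # [4, 1, -1, 0]
```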

#### 🧠 Exercise: 
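* Take the 'CompTotal' series and convert it into Australian dollars
* First we will do a basic example and then a more complicated one

For background on speeding up this kind of row-by-row code, see [A Beginner's Guide to Optimizing Pandas Code for Speed](https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6)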

In [None]:
def convert_to_aus(mystring):
    """Converts to Aussie dollars using fixed (hardcoded) example exchange rates"""
    # Split the string into the number part and the currency part
    numeric_val = float(mystring.split(' ')[0])
    currency_type = mystring.split(' ')[1]

    if 'usd' in currency_type.lower():
        return numeric_val * 1.38
    elif 'gbp' in currency_type.lower():
        return numeric_val * 1.89
    else:
        # Any other currency is returned unconverted
        return numeric_val

In [None]:
df['CompTotal_Test_1'] = df['CompTotal'].apply(convert_to_aus)
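The same fixed-rate conversion can also be written with column-wise operations instead of a row-by-row `apply()`. This is a minimal sketch on made-up data in the same "amount currency" format. The `rates` dictionary reuses the example rates from `convert_to_aus`, and, unlike that function, it matches the currency code exactly rather than by substring.

```python
import pandas as pd

# Toy data (hypothetical) in the same "<amount> <currency>" format as CompTotal
comp = pd.Series(["100.0 USD", "200.0 GBP", "300.0 AUD"])

# Same fixed example rates as convert_to_aus; other currencies are left unconverted
rates = {"USD": 1.38, "GBP": 1.89}

# Split once into an amount column (0) and a currency column (1)
parts = comp.str.split(" ", n=1, expand=True)
amount = parts[0].astype(float)

# Look up each currency's rate; currencies without a rate get a factor of 1.0
rate = parts[1].str.upper().map(rates).fillna(1.0)

comp_aud = amount * rate
print(comp_aud.round(2).tolist())  # [138.0, 378.0, 300.0]
```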

In [None]:
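c = CurrencyRates()

def convert_rates2(amount, base_currency):
    """Converts an amount to AUD using the rates fetched by forex_python"""
    if base_currency != 'AUD':
        return c.convert(base_cur = base_currency, dest_cur = 'AUD', amount = amount)
    else:
        return amount

In [None]:
# CompTotal holds strings such as '130000.0 USD', so take the numeric part first
amounts = df['CompTotal'].str.split(' ').str[0].astype(float)

# Note: this makes one forex_python lookup for every row that is not already in AUD
df['CompTotal_Test_2'] = np.vectorize(convert_rates2)(amounts, df['CurrencySymbol'])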

In [None]:
df

In [None]:
df['CompTotal'] = df['CompTotal'].apply(convert_to_aus)

Getting real-time exchange rates is beyond the scope of this pandas tutorial, but the forex-python documentation shows how to get [real-time currency updates](https://forex-python.readthedocs.io/en/latest/usage.html)

End of video

---

In [None]:
# Not really related to this tutorial, but here is a handy regex snippet
# that strips the digits out of a string, leaving only the non-numeric parts
import re
string1 = "word1 000 word2"

pattern = r'[0-9]'

# Match all digits in the string and replace them with an empty string
new_string = re.sub(pattern, '', string1)

print(new_string)

In [None]:
from forex_python.converter import CurrencyRates

c = CurrencyRates()

print(c.get_rate('USD', 'GBP'))