# Job search of September 2023

## Context

This is my first "real" job search, as it coincides with the end of my engineering studies.

I first started looking for a position in May 2023, halfway through my end-of-studies internship, but without success.
After my summer holidays, I resumed the search, this time with better organization so that I could keep track of my applications and recruitment processes.

This analysis only covers the search that began in September 2023. Earlier data is excluded because its tracking and formatting were inconsistent.

My goal was to find a junior position in the field of data science or machine learning.

### Constraints

#### Targeted positions

- Data Scientist
- Data Engineer
- Data Analyst
- Machine Learning Engineer

#### Location

- France (on-site, hybrid, remote)
- Europe (remote)

## JSA analysis

### Imports

In [106]:
%matplotlib inline

import pandas as pd
from plotly import express as px

### Load the data

In [107]:
df_jsa_10_2023 = pd.read_csv("../data/raw/10_2023.csv", dtype=str)

df_jsa_10_2023.head()

Unnamed: 0,Status,Company,Role,Location,Source,Attendance,Application,Type,Phone call,1st interview,2nd interview,3rd interview,Proposition,Final answer,URL
0,Rejected,Orange,Data Scientist,Rennes,Carrer Site,,30/08/2023,Offer,13/10/2023,,,,,26/10/2023,https://orange.jobs/jobs/v3/offers/127229?lang=
1,Applied,Orange,Data Scientist,Toulouse,Carrer Site,,30/08/2023,Offer,,,,,,,https://orange.jobs/jobs/v3/offers/128950?lang=
2,Applied,AZ Consulting,Data Scientist,,Recruitment Agency,,31/08/2023,Spontaneous,,,,,,,https://az-recrutement.wixsite.com/az-recrutement
3,No project for now,ALTEN,,Marignane,Reach-out,Hybrid,,,05/09/2023,06/09/2023 17:00,,,,06/09/2023,
4,No project for now,Astek,Data Scientist,,LinkedIn,,31/08/2023,Spontaneous,01/09/2023,07/09/2023 18:00,,,,07/09/2023,https://www.linkedin.com/posts/killian-vincent...


In [108]:
df_jsa_10_2023.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 447 entries, 0 to 446
Data columns (total 15 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   Status         447 non-null    object
 1   Company        444 non-null    object
 2   Role           441 non-null    object
 3   Location       419 non-null    object
 4   Source         446 non-null    object
 5   Attendance     210 non-null    object
 6   Application    437 non-null    object
 7   Type           436 non-null    object
 8   Phone call     15 non-null     object
 9   1st interview  17 non-null     object
 10  2nd interview  4 non-null      object
 11  3rd interview  1 non-null      object
 12  Proposition    4 non-null      object
 13  Final answer   146 non-null    object
 14  URL            411 non-null    object
dtypes: object(15)
memory usage: 52.5+ KB
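
The non-null counts above double as a rough funnel: each stage column is only filled in when an application reached that stage. A minimal sketch of reading those counts directly, assuming a hypothetical three-row `sample` frame (not the real tracker) with a few of the same column names:

```python
import pandas as pd

# Hypothetical sample: same column names as the tracker, made-up rows
sample = pd.DataFrame({
    "Application": ["30/08/2023", "30/08/2023", "31/08/2023"],
    "Phone call": ["13/10/2023", None, None],
    "1st interview": [None, None, None],
})

# Rows that have a value in each column (what info() reports as "Non-Null Count")
reached = sample.notna().sum()

# Share of rows with no value in each column
missing_share = sample.isna().mean()

print(reached)
print(missing_share)
```

On the real frame, the same two calls could be applied to `df_jsa_10_2023` as is.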


### Format the data

In [109]:
df_jsa_10_2023_f = df_jsa_10_2023.copy()

#### Datetime

In [110]:
dates_col = ["Application",
             "Phone call",
             "1st interview",
             "2nd interview",
             "3rd interview",
             "Proposition",
             "Final answer"]

# Convert to datetime; the raw dates are day-first (e.g. 30/08/2023), so state it explicitly
df_jsa_10_2023_f[dates_col] = df_jsa_10_2023_f[dates_col].apply(pd.to_datetime, format="mixed", dayfirst=True)
# Remove the time component
df_jsa_10_2023_f[dates_col] = df_jsa_10_2023_f[dates_col].apply(lambda x: x.dt.normalize())

df_jsa_10_2023_f[dates_col].head()

Unnamed: 0,Application,Phone call,1st interview,2nd interview,3rd interview,Proposition,Final answer
0,2023-08-30,2023-10-13,NaT,NaT,NaT,NaT,2023-10-26
1,2023-08-30,NaT,NaT,NaT,NaT,NaT,NaT
2,2023-08-31,NaT,NaT,NaT,NaT,NaT,NaT
3,NaT,2023-09-05,2023-09-06,NaT,NaT,NaT,2023-09-06
4,2023-08-31,2023-09-01,2023-09-07,NaT,NaT,NaT,2023-09-07
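
The raw tracker stores dates day-first (30/08/2023 cannot be read any other way), which makes strings such as 05/09/2023 ambiguous for a parser. A minimal sketch of how the `dayfirst` flag of `pd.to_datetime` changes the result, assuming pandas 2.0 or later (for `format="mixed"`) and a hypothetical `raw` series that mimics the column layout:

```python
import pandas as pd

# Hypothetical sample: day/month/year strings, one with a time, one missing
raw = pd.Series(["30/08/2023", "05/09/2023", "06/09/2023 17:00", None])

# An ambiguous scalar is read month-first by default ...
month_first = pd.to_datetime("05/09/2023")
# ... and day-first only when asked explicitly
day_first = pd.to_datetime("05/09/2023", dayfirst=True)

# Element-wise format inference, day-first, then drop the time component
parsed = pd.to_datetime(raw, format="mixed", dayfirst=True).dt.normalize()

print(month_first.date(), day_first.date())
print(parsed)
```

A cheap sanity check on the real data is that no phone call or interview date should fall before the application date of the same row.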


#### Categories

In [111]:
categories_col = ["Status", "Location", "Role", "Source", "Attendance", "Type"]

df_jsa_10_2023_f[categories_col] = df_jsa_10_2023_f[categories_col].fillna("Not specified")

df_jsa_10_2023_f[categories_col].head()

Unnamed: 0,Status,Location,Role,Source,Attendance,Type
0,Rejected,Rennes,Data Scientist,Carrer Site,Not specified,Offer
1,Applied,Toulouse,Data Scientist,Carrer Site,Not specified,Offer
2,Applied,Not specified,Data Scientist,Recruitment Agency,Not specified,Spontaneous
3,No project for now,Marignane,Not specified,Reach-out,Hybrid,Not specified
4,No project for now,Not specified,Data Scientist,LinkedIn,Not specified,Spontaneous
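
The columns above stay plain `object` columns after `fillna`. If a real categorical dtype is wanted (for example for grouping or plotting), one possible extra step is `astype("category")`. This is a sketch on a hypothetical two-column `sample` frame, not something the notebook itself does:

```python
import pandas as pd

# Hypothetical sample with missing labels, mimicking two tracker columns
sample = pd.DataFrame({
    "Status": ["Rejected", "Applied", None],
    "Attendance": [None, "Hybrid", None],
})

for col in ["Status", "Attendance"]:
    # Fill the gaps first, then convert the column to a categorical dtype
    sample[col] = sample[col].fillna("Not specified").astype("category")

# Number of rows per label
status_counts = sample["Status"].value_counts()

print(sample.dtypes)
print(status_counts)
```

Note that a categorical dtype is lost again when the frame is written to CSV, so it would have to be re-applied after reloading.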


### Save processed data

In [112]:
df_jsa_10_2023_f.head()

Unnamed: 0,Status,Company,Role,Location,Source,Attendance,Application,Type,Phone call,1st interview,2nd interview,3rd interview,Proposition,Final answer,URL
0,Rejected,Orange,Data Scientist,Rennes,Carrer Site,Not specified,2023-08-30,Offer,2023-10-13,NaT,NaT,NaT,NaT,2023-10-26,https://orange.jobs/jobs/v3/offers/127229?lang=
1,Applied,Orange,Data Scientist,Toulouse,Carrer Site,Not specified,2023-08-30,Offer,NaT,NaT,NaT,NaT,NaT,NaT,https://orange.jobs/jobs/v3/offers/128950?lang=
2,Applied,AZ Consulting,Data Scientist,Not specified,Recruitment Agency,Not specified,2023-08-31,Spontaneous,NaT,NaT,NaT,NaT,NaT,NaT,https://az-recrutement.wixsite.com/az-recrutement
3,No project for now,ALTEN,Not specified,Marignane,Reach-out,Hybrid,NaT,Not specified,2023-09-05,2023-09-06,NaT,NaT,NaT,2023-09-06,
4,No project for now,Astek,Data Scientist,Not specified,LinkedIn,Not specified,2023-08-31,Spontaneous,2023-09-01,2023-09-07,NaT,NaT,NaT,2023-09-07,https://www.linkedin.com/posts/killian-vincent...


In [113]:
df_jsa_10_2023_f.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 447 entries, 0 to 446
Data columns (total 15 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   Status         447 non-null    object        
 1   Company        444 non-null    object        
 2   Role           447 non-null    object        
 3   Location       447 non-null    object        
 4   Source         447 non-null    object        
 5   Attendance     447 non-null    object        
 6   Application    437 non-null    datetime64[ns]
 7   Type           447 non-null    object        
 8   Phone call     15 non-null     datetime64[ns]
 9   1st interview  17 non-null     datetime64[ns]
 10  2nd interview  4 non-null      datetime64[ns]
 11  3rd interview  1 non-null      datetime64[ns]
 12  Proposition    4 non-null      datetime64[ns]
 13  Final answer   146 non-null    datetime64[ns]
 14  URL            411 non-null    object        
dtypes: datetime64[ns](7), object(8)

In [114]:
df_jsa_10_2023_f.to_csv("../data/processed/10_2023.csv", index=False)
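
CSV does not store dtypes, so the datetime columns come back as strings unless they are parsed again on load. A minimal sketch of that round trip, using an in-memory buffer and a hypothetical two-row `sample` frame instead of the real file:

```python
import io

import pandas as pd

date_cols = ["Application", "Final answer"]

# Hypothetical sample with datetime columns, one value missing
sample = pd.DataFrame({
    "Status": ["Rejected", "Applied"],
    "Application": pd.to_datetime(["2023-08-30", "2023-08-30"]),
    "Final answer": pd.to_datetime(["2023-10-26", None]),
})

# Write to an in-memory buffer the same way as to a file on disk
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates restores the datetime dtype for the listed columns
reloaded = pd.read_csv(buffer, parse_dates=date_cols)

print(reloaded.dtypes)
```

For the real file, the same `parse_dates` argument would take the full `dates_col` list defined earlier.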