# Pipeline to Conclusion
> ## Acquiring Related Market Data --> Analysis --> Inferences --> Expanding Information for Future Use and Consistency


### Web Scraping the Data

I used a few packages and some basic HTML knowledge to collect this data.
My sources are:
- Global and USA markets: builtin.com and 500 Startups
- Turkey market: Startup Borsa and Startup Market

The snippets below show the imports of the packages used and the dataframes.


In [140]:
#Related Imports
import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

#### Turkey Marketplace


In [141]:
df_smarket_hr = pd.read_csv(r'csv/hr_startupmarket.csv')
df_smarket_hr = df_smarket_hr.drop(['web-scraper-order', 'web-scraper-start-url'], axis=1)

df_sborsa_all = pd.read_csv(r'csv/startup_borsa.csv')
df_sborsa_all = df_sborsa_all.drop(['web-scraper-order', 'web-scraper-start-url'], axis=1)

# Keep only the HR sector ('İnsan Kaynakları & İşe Alım' means 'Human Resources & Recruitment')
df_sborsa_hr = df_sborsa_all[(df_sborsa_all.sector == 'İnsan Kaynakları & İşe Alım')]

#### Global Marketplace (Includes Turkey)


In [142]:
df_500_hr = pd.read_csv(r'csv/HR_500startup.csv')
df_500_hr = df_500_hr.rename({'name':'title'}, axis=1)
df_500_hr = df_500_hr.drop(['web-scraper-order', 'web-scraper-start-url'], axis=1)

# Keep only the startups located in Turkey
df_500_hr_turkey = df_500_hr[(df_500_hr.location == 'Turkey')]

#### USA Marketplace


In [143]:
df_builtin_hr = pd.read_csv(r'csv/builtin_hr.csv')
df_builtin_hr = df_builtin_hr.drop(['web-scraper-order', 'web-scraper-start-url'], axis=1)

### For subsector classification, I reviewed the general tasks of HR and how they can be extended. These tasks vary from source to source, but the following are common to most:
- Recruiting and staffing employees
- Employee benefits
- Employee compensation
- Employee and labor relations
- Human resources compliance
- Organizational structure
- Human resources information and payroll
- Employee training and development


#### After examining the sector columns in our data, I arrived at the following

In [144]:
#Subsections on 500Startup
subsections = df_500_hr['sector'].unique()

### The subsections, merged with related contexts, are:
- Personnel / Benefits
- Professional training
- College & Continuing Education
- Recruiting
- Tools & Services
- Legal services
- Infrastructure
- Collaboration
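
The sector labels are not spelled consistently across rows (for example, both `College & Continuing Education` and `College / Continuing education` occur in the 500 Startups data), so some normalization helps before merging subsections. A minimal sketch, assuming a hypothetical helper `normalize_sector` that is not part of the original notebook:

```python
import re

def normalize_sector(label):
    """Collapse spelling variants of a sector label into one key (hypothetical helper)."""
    # Treat '&' and '/' as the same separator and ignore case and extra spaces.
    parts = re.split(r"\s*[&/]\s*", label.strip().lower())
    return " / ".join(parts)

labels = ["College & Continuing Education", "College / Continuing education", "Recruiting"]
unique_keys = sorted({normalize_sector(label) for label in labels})
print(unique_keys)
```

With this kind of key, the two College variants fall into the same group, while unrelated labels such as Recruiting stay separate.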

### With so many startups to investigate, I need to select which ones to analyze, because our time is limited and this part is not automated
#### To make this selection, I looked for startups that appear in more than one dataframe.

The technical approach is to use a fuzzy string matching method. The related code is below.

In [145]:
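# String-matching helper: finds titles that appear in both datasets so we can
# pick out the most relevant startups and focus on them.
def fuzzy_merge(df_1, df_2, key1, key2, threshold=90, limit=2):
    # Work on a copy so the caller's dataframe (possibly a filtered slice) is not modified
    df_1 = df_1.copy()
    s = df_2[key2].tolist()

    # For each title in df_1, get the `limit` best (candidate, score) pairs from df_2
    m = df_1[key1].apply(lambda x: process.extract(x, s, limit=limit))

    # Keep only candidates scoring at or above the threshold, joined into one string
    df_1['matches'] = m.apply(lambda x: ', '.join(i[0] for i in x if i[1] >= threshold))

    return df_1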
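
The idea behind `fuzzy_merge` can be tried without any dataframes. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for `fuzzywuzzy`, so its scores will differ from those of `process.extract`. The `similarity` and `best_matches` helpers and the sample candidate list are illustrative assumptions, not part of the notebook.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case (stand-in scorer)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())

def best_matches(title, candidates, threshold=80, limit=2):
    """Keep up to `limit` top-scoring candidates whose score reaches `threshold`."""
    scored = sorted(((similarity(title, c), c) for c in candidates), reverse=True)
    return [c for score, c in scored[:limit] if score >= threshold]

candidates = ["KolayIK", "Greenhouse Software", "15Five"]
print(best_matches("Kolay İK", candidates))  # near-identical spelling is kept
print(best_matches("Cooleaf", candidates))   # nothing similar enough, so empty
```

Raising `threshold` trades recall for precision: fewer false pairs, but spelling variants of the same company are more likely to be missed.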

#### For the Turkey market there are three dataframe pairings to compare; the titles they share are shown below

In [146]:
s500_match_df = fuzzy_merge(df_smarket_hr, df_500_hr_turkey, 'title', 'title', threshold=80)
s500_match_df[(s500_match_df.matches != '')]

| | title | context | link | link-href | logo-src | matches |
|---|---|---|---|---|---|---|
| 14 | Kolay İK | İşinizi kolaylaştırır. | İncele | https://startupmarket.co/kolay-ik | https://startupmarket.co/cache/100x100/upload/... | KolayIK |


The only startup in common here is Kolay İK.


In [None]:
borsa500_match_df = fuzzy_merge(df_500_hr_turkey, df_sborsa_hr, 'title', 'title', threshold=80)
borsa500_match_df[(borsa500_match_df.matches != '')]

As we can see, 500 Startups and Startup Borsa do not have any startups in common.


In [148]:
marketborsa_match_df = fuzzy_merge(df_sborsa_all, df_smarket_hr, 'title', 'title', threshold=90)
marketborsa_match_df[(marketborsa_match_df.matches != '')]

| | title | sector | context | matches |
|---|---|---|---|---|
| 49 | SmartCV Teknoloji AŞ | İnsan Kaynakları & İşe Alım | İşe alımda oyunlaştırma teknolojisi ve ML kull... | SmartCV |
| 55 | FlexyTime | SaaS | FlexyTime evden veya ofisten verimli çalışması... | FlexyTime |
| 66 | İdenfit | Yazılım | Idenfit; insan kaynakları yönetimine, bütüncül... | Idenfit |
| 125 | Yersonel.com | İnsan Kaynakları & İşe Alım | Yersonel.com gastronomi sektörüne odaklanmış k... | Yersonel.com |
| 134 | Smarthronline | İnsan Kaynakları & İşe Alım | Online career platform connecting logistics pr... | Smarthronline |
| 206 | Verified by Sertifier INC. | Eğitim & Eğitim Teknolojileri | Verified, Sertifier INC. tarafından geliştiril... | Sertifier |
| 214 | ilk-is.com | İnsan Kaynakları & İşe Alım | İş arayan veya iş başvurusu yapmak isteyen her... | ilk-is |
| 237 | Ara | Tüketici Hizmetleri | “Türkiye’nin Hizmet Arama Motoru” sloganıyla e... | Zumbara Zaman Kumbarası |
| 254 | Test Invite | Eğitim & Eğitim Teknolojileri | Test Invite, sınav ve ölçme-değerlendirme süre... | Test Invite |
| 267 | DinamikCRM | SaaS | DinamikCRM yaklaşık 1 yıl süren ürün hazırlıkl... | DinamikCRM |


Except for a few spurious matches such as
> - ara - zumbara 
> - otherside esports - others

the results are satisfactory.

Given the limited time, the startups matched by this merge are a reasonable starting point for the Turkey market.
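
A spurious pair such as ara - zumbara is what one would expect from a scorer that rewards partial (substring) overlap, since a very short title can sit entirely inside a longer one. One possible guard, sketched here under the assumption that very short titles are better reviewed by hand, is to require a minimum length and compare the strings as a whole. The `is_plausible_match` helper, its default values, and the use of `difflib` instead of `fuzzywuzzy` are all illustrative choices rather than the notebook's method.

```python
from difflib import SequenceMatcher

def is_plausible_match(a, b, threshold=0.8, min_len=4):
    """Accept a pair only if both titles are long enough and similar as whole strings."""
    a, b = a.strip().lower(), b.strip().lower()
    if min(len(a), len(b)) < min_len:
        return False  # too short to trust; review by hand instead
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(is_plausible_match("Ara", "Zumbara Zaman Kumbarası"))  # short title is rejected
print(is_plausible_match("Otherside Esports", "Others"))     # whole strings differ too much
print(is_plausible_match("FlexyTime", "Flexytime"))          # same name, different case
```

A check like this could run as a second pass over the rows where `matches` is not empty, so the original fuzzy scores are still used for the first selection.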


#### For the global market, we can start with the startups shown below

In [162]:
df_builtin_hr['title'] = df_builtin_hr['title'].apply(str)
builtin500_match_df = fuzzy_merge(df_500_hr, df_builtin_hr, 'title', 'title', threshold=90)
builtin500_match_df[(builtin500_match_df.matches != '')]

| | Technology | title | sector | location | matches |
|---|---|---|---|---|---|
| 36 | SaaS | Cooleaf | Personnel / Benefits | US | Cooleaf |
| 42 | Marketplace | Crash | Recruiting | US | Crash |
| 81 | Marketplace | Nanno | Early childhood | US | nan, nan |
| 86 | SaaS | YellowDig | College & Continuing Education | US | Yello |
| 87 | Cloud / Content | OKpanda | College / Continuing education | Japan | KPA |
| 153 | SaaS | 15Five | Personnel / Benefits | US | 15Five |
| 158 | SaaS | Resource | Recruiting | US | Society for Human Resource Management |
| 194 | Marketplace | WorkAmerica | Recruiting | US | WorkAmerica |
| 218 | Cloud / Content | ThinkParametric | Professional training | Mexico | KPA |
| 260 | - | Greenhouse | Infrastructure | Singapore | Greenhouse Software |
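
The `nan, nan` entry next to Nanno most likely comes from missing titles in the builtin.com data being turned into the literal string `'nan'` by `apply(str)`, which then resembles the start of "Nanno". A minimal sketch of filtering such values out of the candidate list before matching follows; `clean_titles` is a hypothetical helper and the sample list is made up for illustration.

```python
def clean_titles(titles):
    """Drop missing or empty titles so they never become match candidates."""
    cleaned = []
    for title in titles:
        # float('nan') and None are not strings, so they are skipped here.
        if isinstance(title, str) and title.strip():
            cleaned.append(title.strip())
    return cleaned

raw_titles = ["Cooleaf", float("nan"), None, "  ", "15Five"]
print(clean_titles(raw_titles))
```

Applied to the list built inside `fuzzy_merge`, a filter like this would keep placeholder values out of the `matches` column without changing the scores of real titles.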


Of course, these dataframes do not include every startup worth investigating.  
They offer only a starting point for further research while broadening the perspective, and perhaps a way to automate part of the data flow.
Adding data from sites such as AngelList might help.
Getting the right insights would require handling too many parameters.

## One theme that stands out among these HR startups is recruiting. Several organizations focus on improving recruiting and hiring processes, whether by offering employers skills evaluation tools, using artificial intelligence to analyze job candidates, or building better job matching platforms, all while enabling remote work.

## In addition to this, the organizations included above show a rising interest in digitizing standard HR procedures and providing next-level employee engagement and development services.

## My personal evaluation of these trends comes down to several headlines. The startups listed are not limited to the analyzed data; some were added through manual research.
### 1. Improving Support For Remote Work And Hiring.  
#### RemoteTeam
- All-in-one operating system that provides HR solutions for remote teams
#### FlexyTime
- Removes the drawbacks of remote work for managers who need to monitor their teams
### 2. Supporting Employees’ Physical, Mental, And Emotional Well-Being.  
#### Cooleaf
- All-in-one engagement software
### 3. Investing In Learning & Development.  
#### 15Five
- Fosters transparency, accountability, and quality feedback
- Optimizes remote and distributed teams
#### Test Invite
- Also takes an all-inclusive approach to the assessment phase
### 4. Upgrading Processes With AI And Automation.  
#### SeekOut
- Data-driven, with features such as AI matching and an AI-powered talent search engine
#### SmartCV
- Reduces recruiters' workload by using AI to screen out candidates