# SQL Project
You were hired by Ironhack to perform an analytics consulting project entitled "Competitive Landscape".

Your mission is to create and populate an appropriate database covering the coding schools that compete with Ironhack, and to design suitable queries that answer business questions of interest (to be defined by you).


**Suggested Steps in the Project:**


*   Read this notebook and understand each function. Comment the code appropriately

*   Populate the list of schools with a wider variety of schools (how are you going to get the school ID?)

* Take a look at the obtained dataframes. What dimensions do you have? What keys do you have? How could the different dataframes be connected?

* Go back to the drawing board and try to create an entity relationship diagram for the available tables

* Once you have the schemas you want, you will need to:
  - create the suitable SQL queries to create the tables and populate them
  - run these queries using the appropriate Python connectors
  
* Bonus: How will this data model be updated in the future? Please write auxiliary functions that test the database for data quality issues. For example: how could you make sure you only include the most recent comments when you re-run the script?
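
One possible answer to the bonus question is to make the review `id` a primary key and let the database skip rows it already holds, so that re-running the script only adds new comments. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption, made so the example runs anywhere); the `load_new_comments` helper, the two-column table, and the review id `999999` are hypothetical. In MySQL the analogous statement is `INSERT IGNORE`.

```python
import sqlite3

def load_new_comments(conn, rows):
    """Insert (id, overall_score) rows, skipping ids already stored.

    Returns the number of rows actually inserted.
    """
    before = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
    # OR IGNORE skips any row whose primary key already exists
    conn.executemany(
        "INSERT OR IGNORE INTO comments (id, overall_score) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
    return after - before

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, overall_score REAL)")

first_run = load_new_comments(conn, [(276568, 1.0), (276147, 5.0)])
# Second run: one review is already stored, one is new (hypothetical id)
second_run = load_new_comments(conn, [(276147, 5.0), (999999, 4.0)])
print(first_run, second_run)
```

Because the primary key does the filtering, the script does not need to remember which reviews it loaded last time.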


# Suggested Deliverables

* 5-6 minute presentation of data model created, decision process and business analysis proposed

* exported .sql file with the final schema

* Supporting Python files used to generate all logic

* High-level documentation explaining the tables designed and focusing on update methods

Crucial hint: check out the following tutorial:
https://www.dataquest.io/blog/sql-insert-tutorial/
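
As a minimal sketch of the "create the tables and populate them" step, the following uses the standard-library `sqlite3` module in place of MySQL (an assumption, for portability). The table and column names are illustrative and not the final schema; the IDs are taken from the data shown later in this notebook. Values are passed as parameters rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE schools (
    school_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE locations (
    location_id INTEGER PRIMARY KEY,
    country     TEXT,
    city        TEXT,
    school_id   INTEGER NOT NULL REFERENCES schools (school_id)
);
""")

# Parameterized inserts: values travel separately from the SQL text
conn.executemany(
    "INSERT INTO schools (school_id, name) VALUES (?, ?)",
    [(10828, "ironhack"), (10868, "le-wagon")],
)
conn.executemany(
    "INSERT INTO locations (location_id, country, city, school_id) VALUES (?, ?, ?, ?)",
    [(15901, "Germany", "Berlin", 10828), (16109, "France", "Paris", 10828)],
)
conn.commit()

# A row pointing at a school that does not exist is rejected
try:
    conn.execute("INSERT INTO locations VALUES (1, 'Spain', 'Madrid', 42)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

n_locations = conn.execute("SELECT COUNT(*) FROM locations").fetchone()[0]
print(n_locations, orphan_rejected)
```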


In [1]:
# you must populate this dict with the schools required -> try talking to the teaching team about this


schools = {   
'ironhack' : 10828,
'la-capsule' : 10853,
'app-academy' : 10525,
'springboard' : 11035,
'metis' : 10886,
'practicum-by-yandex' : 11225,
'le-wagon' : 10868,
'academia-de-codigo' :10494 ,
'react-graphql-academy' : 10972
}

import re
import pandas as pd
from pandas import json_normalize  # top-level import; the pandas.io.json path is deprecated
import requests



def get_comments_school(school):
    TAG_RE = re.compile(r'<[^>]+>')
    # Build the API URL for this school's reviews; the school slug is interpolated so the same call works for competitors
    url = "https://www.switchup.org/chimera/v1/school-review-list?mainTemplate=school-review-list&path=%2Fbootcamps%2F" + school + "&isDataTarget=false&page=3&perPage=10000&simpleHtml=true&truncationLength=250"
    # Make the GET request and parse the JSON response into a dict
    data = requests.get(url).json()
    # Convert the list of review records into a DataFrame
    reviews = pd.DataFrame(data['content']['reviews'])
  
    # Helper that strips HTML tags from a review body
    def remove_tags(x):
        return TAG_RE.sub('', x)
    # Plain-text version of the review, plus the school slug for later joins
    reviews['review_body'] = reviews['body'].apply(remove_tags)
    reviews['school'] = school
    return reviews
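
To see what the tag-stripping regex in `get_comments_school` does, here is a standalone sketch on a made-up review string. It also shows `html.unescape` from the standard library, which the notebook does not use; it is included only as a possible extra step, since removing tags leaves entities such as `&amp;` untouched.

```python
import html
import re

TAG_RE = re.compile(r'<[^>]+>')

# Made-up sample in the same shape as the API's `body` field
raw = '<span class="truncatable"><p>After 2.5 years in M&amp;A, I switched.</p></span>'

no_tags = TAG_RE.sub('', raw)      # tags removed, entities still encoded
clean = html.unescape(no_tags)     # entities decoded as well

print(no_tags)
print(clean)
```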

In [2]:
# could you write this as a list comprehension? ;)
# comments = []

# for school in schools.keys():
#    print(school)
#    comments.append(get_comments_school(school))
    
comments = [get_comments_school(key) for key in schools.keys()]

comments = pd.concat(comments)
comments.sample(50)

Unnamed: 0,id,name,anonymous,hostProgramName,graduatingYear,isAlumni,jobTitle,tagline,body,rawBody,...,queryDate,program,user,overallScore,comments,overall,curriculum,jobSupport,review_body,school
673,240929,Alec Plehn,False,Software Engineering,2018.0,True,,Amazing experience at Ironhack,"<span class=""truncatable""><p>If you want to st...",If you want to start programming but dont know...,...,2018-04-11,Full-time Web Development Bootcamp,{'image': None},5.0,[],5.0,5.0,5.0,If you want to start programming but dont know...,ironhack
379,256160,Sergui Morejón,False,Software Engineering,2019.0,False,,Ironhack change my life !,"<span class=""truncatable""><p>I remember when I...","I remember when I emigrated here to the US, an...",...,2019-10-07,Full-time Web Development Bootcamp,{'image': None},5.0,[],5.0,5.0,5.0,"I remember when I emigrated here to the US, an...",ironhack
562,244853,Jackson,False,Software Engineering,2018.0,True,Software Developer,"Had no coding experience before, now I'm a web...","<span class=""truncatable""><p>Before App Academ...","Before App Academy, all I knew was that I want...",...,2018-09-24,Software Engineer Track: In-Person,{'image': None},5.0,[],5.0,5.0,5.0,"Before App Academy, all I knew was that I want...",app-academy
99,268230,hans cameus,False,,2020.0,False,,Amazing experience ..,"<span class=""truncatable""><p></p><p>I followed...",<p>I followed a first training named Simplon w...,...,2020-09-04,,{'image': None},5.0,[],5.0,5.0,5.0,I followed a first training named Simplon whic...,la-capsule
116,273517,Emily Paul,False,,2021.0,False,Product Designer,Believe the hype,"<span class=""truncatable""><p></p><p>My experie...",<p>My experience with Springboard has been tre...,...,2021-02-09,UX Career Track,{'image': None},5.0,[],5.0,5.0,5.0,My experience with Springboard has been tremen...,springboard
1720,260992,Camille Franceschi,False,Full-Stack Web Development,2017.0,,,Amazing experience,"<span class=""truncatable""><p>After 2.5 years w...","After 2.5 years working in M&A, I was looking ...",...,2017-10-04,FullStack program,{'image': None},5.0,[],5.0,5.0,5.0,"After 2.5 years working in M&amp;A, I was look...",le-wagon
405,264904,Jules Ronne,False,Software Engineering,2020.0,False,,Best pedagogic experience I’ve ever had,"<span class=""truncatable""><p>After a Master de...",After a Master degree in digital project manag...,...,2020-05-24,Web Development Course - Full-Time,{'image': None},5.0,[],5.0,5.0,5.0,After a Master degree in digital project manag...,le-wagon
23,272680,Ariane Gaudeaux,False,,2021.0,True,UX/UI Designer,Just MAGICAL!,"<span class=""truncatable""><p></p><p>Although i...","<p>Although it was remote, it was just MAGICAL...",...,2021-01-22,,{'image': None},5.0,[],5.0,5.0,5.0,"Although it was remote, it was just MAGICAL. I...",ironhack
397,264274,Taylor Brightwell,False,Digital Marketing,2019.0,False,,Mentorship and practice for the win,"<span class=""truncatable""><p>What set's the Sp...",What set's the Springboard Digital Marketing c...,...,2020-05-05,Digital Marketing Career Track,{'image': None},4.0,[],4.0,3.0,5.0,What set's the Springboard Digital Marketing c...,springboard
1175,246374,Kaylin Bittner,False,Software Engineering,2018.0,True,,Le Wagon was wonderful experience and continue...,"<span class=""truncatable""><p>After attending L...","After attending Le Wagon, you are limited only...",...,2019-01-08,FullStack program - 35+ locations,{'image': None},5.0,[],5.0,5.0,5.0,"After attending Le Wagon, you are limited only...",le-wagon


In [3]:
from pandas import json_normalize

def get_school_info(school, school_id):
    url = 'https://www.switchup.org/chimera/v1/bootcamp-data?mainTemplate=bootcamp-data%2Fdescription&path=%2Fbootcamps%2F'+ str(school) + '&isDataTarget=false&bootcampId='+ str(school_id) + '&logoTag=logo&truncationLength=250&readMoreOmission=...&readMoreText=Read%20More&readLessText=Read%20Less'

    # Request the school's profile data and parse the JSON response
    data = requests.get(url).json()

    # One row per course name
    courses = data['content']['courses']
    courses_df = pd.DataFrame(courses, columns= ['courses'])

    # Flatten the nested location records (country, city, state) into dotted columns
    locations = data['content']['locations']
    locations_df = json_normalize(locations)

    # One row per merit badge
    badges_df = pd.DataFrame(data['content']['meritBadges'])
    
    # Single-row frame holding the school's own attributes
    website = data['content']['webaddr']
    description = data['content']['description']
    logoUrl = data['content']['logoUrl']
    school_df = pd.DataFrame([website,description,logoUrl]).T
    school_df.columns =  ['website','description','LogoUrl']

    # Tag every frame with the school slug and ID so the tables can be joined later
    locations_df['school'] = school
    courses_df['school'] = school
    badges_df['school'] = school
    school_df['school'] = school

    locations_df['school_id'] = school_id
    courses_df['school_id'] = school_id
    badges_df['school_id'] = school_id
    school_df['school_id'] = school_id

    return locations_df, courses_df, badges_df, school_df

locations_list = []
courses_list = []
badges_list = []
schools_list = []

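for school, school_id in schools.items():  # avoid shadowing the built-in id()
    print(school)
    # unpack the four dataframes returned for this school
    locations_df, courses_df, badges_df, school_df = get_school_info(school, school_id)
    
    locations_list.append(locations_df)
    courses_list.append(courses_df)
    badges_list.append(badges_df)
    schools_list.append(school_df)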



ironhack
la-capsule
app-academy
springboard
metis
practicum-by-yandex
le-wagon
academia-de-codigo
react-graphql-academy
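
The dotted column names that appear in the locations frames (`country.id`, `city.name`, and so on) come from `json_normalize`, which flattens nested dictionaries into columns. Here is a small sketch on hand-written records shaped like the API's `locations` field; the exact payload layout is an assumption.

```python
import pandas as pd

locations = [
    {"id": 15901, "description": "Berlin, Germany",
     "country": {"id": 57, "name": "Germany", "abbrev": "DE"},
     "city": {"id": 31156, "name": "Berlin"}},
    {"id": 17233, "description": "Online"},  # no country or city keys
]

# Nested keys become "parent.child" columns; missing keys become NaN
flat = pd.json_normalize(locations)
print(flat.columns.tolist())
print(flat[["id", "country.name", "city.name"]])
```

The "Online" record has no nested keys, which is why its country and city columns are empty in the output above.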


In [29]:
locations_list

[      id               description  country.id   country.name country.abbrev  \
 0  15901           Berlin, Germany        57.0        Germany             DE   
 1  16022       Mexico City, Mexico        29.0         Mexico             MX   
 2  16086    Amsterdam, Netherlands        59.0    Netherlands             NL   
 3  16088         Sao Paulo, Brazil        42.0         Brazil             BR   
 4  16109             Paris, France        38.0         France             FR   
 5  16375  Miami, FL, United States         1.0  United States             US   
 6  16376             Madrid, Spain        12.0          Spain             ES   
 7  16377          Barcelona, Spain        12.0          Spain             ES   
 8  16709          Lisbon, Portugal        28.0       Portugal             PT   
 9  17233                    Online         NaN            NaN            NaN   
 
    city.id    city.name city.keyword  state.id state.name state.abbrev  \
 0  31156.0       Berlin       b

In [4]:
locations = pd.concat(locations_list)

In [5]:
locations.sample(5)

Unnamed: 0,id,description,country.id,country.name,country.abbrev,city.id,city.name,city.keyword,state.id,state.name,state.abbrev,state.keyword,school,school_id
1,16771,"Porto, Portugal",28.0,Portugal,PT,31102.0,Porto,porto,,,,,academia-de-codigo,10494
12,16147,"Lille, France",38.0,France,FR,31128.0,Lille,lille,,,,,le-wagon,10868
4,16109,"Paris, France",38.0,France,FR,31136.0,Paris,paris,,,,,ironhack,10828
2,15906,"Buenos Aires, Argentina",60.0,Argentina,AR,31171.0,Buenos Aires,buenos-aires,,,,,le-wagon,10868
26,16767,"Paris, France",38.0,France,FR,31136.0,Paris,paris,,,,,le-wagon,10868


In [31]:
#courses = pd.concat(courses_list)
#courses.head(10)

In [7]:
badges_list

[                name            keyword  \
 0   Available Online   available_online   
 1  Verified Outcomes  verified_outcomes   
 2   Flexible Classes   flexible_classes   
 
                                          description    school  school_id  
 0          <p>School offers fully online courses</p>  ironhack      10828  
 1  <p>School publishes a third-party verified out...  ironhack      10828  
 2  <p>School offers part-time and evening classes...  ironhack      10828  ,
                name           keyword  \
 0  Available Online  available_online   
 1  Flexible Classes  flexible_classes   
 
                                          description      school  school_id  
 0          <p>School offers fully online courses</p>  la-capsule      10853  
 1  <p>School offers part-time and evening classes...  la-capsule      10853  ,
                name           keyword  \
 0  Available Online  available_online   
 1  Flexible Classes  flexible_classes   
 2     Job Guarantee 

In [9]:
badges_raw = pd.concat(badges_list)
badges_raw = badges_raw.drop_duplicates(subset=['name'])
badges_raw


Unnamed: 0,name,keyword,description,school,school_id
0,Available Online,available_online,<p>School offers fully online courses</p>,ironhack,10828
1,Verified Outcomes,verified_outcomes,<p>School publishes a third-party verified out...,ironhack,10828
2,Flexible Classes,flexible_classes,<p>School offers part-time and evening classes...,ironhack,10828
2,Job Guarantee,job_guarantee,<p>School guarantees job placement</p>,app-academy,10525


In [10]:
# Map each badge name to a fixed integer ID; names not listed here return None
BADGE_IDS = {
    'Available Online': 1,
    'Verified Outcomes': 2,
    'Flexible Classes': 3,
    'Job Guarantee': 4,
}

def badges_m(row):
    return BADGE_IDS.get(row)

In [11]:
badges_raw.insert(0,'badges_id',(badges_raw['name'].apply(badges_m)))

In [12]:
badges = badges_raw

In [13]:
badges

Unnamed: 0,badges_id,name,keyword,description,school,school_id
0,1,Available Online,available_online,<p>School offers fully online courses</p>,ironhack,10828
1,2,Verified Outcomes,verified_outcomes,<p>School publishes a third-party verified out...,ironhack,10828
2,3,Flexible Classes,flexible_classes,<p>School offers part-time and evening classes...,ironhack,10828
2,4,Job Guarantee,job_guarantee,<p>School guarantees job placement</p>,app-academy,10525


In [14]:
badges.columns = ['badges_id','name','keyword','description','school','school_id']

In [15]:
clean_badges = badges[['badges_id','name']]

In [16]:
display(clean_badges)

Unnamed: 0,badges_id,name
0,1,Available Online
1,2,Verified Outcomes
2,3,Flexible Classes
2,4,Job Guarantee
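
The `badges_m` lookup lists every badge name by hand, so a badge that appears on the site later would get no ID. One alternative (not what the notebook does) is to let pandas assign the integers with `pd.factorize`, which numbers distinct values in order of first appearance. A sketch on a toy frame:

```python
import pandas as pd

badges_raw = pd.DataFrame({
    "name": ["Available Online", "Verified Outcomes", "Flexible Classes",
             "Available Online", "Flexible Classes", "Job Guarantee"],
})

# factorize returns 0-based codes plus the array of unique values
codes, uniques = pd.factorize(badges_raw["name"])
badges_raw["badges_id"] = codes + 1  # shift to 1-based IDs

clean_badges = (badges_raw[["badges_id", "name"]]
                .drop_duplicates()
                .reset_index(drop=True))
print(clean_badges)
```

IDs assigned this way depend on row order, so for a database that is updated over time it is safer to store the mapping once and look new names up against it.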


In [17]:
clean_schools = locations[['school_id','school']]
clean_schools.columns = ['schools_id','name']
clean_schools = clean_schools.drop_duplicates(subset=['schools_id'])


In [18]:
clean_schools

Unnamed: 0,schools_id,name
0,10828,ironhack
0,10853,la-capsule
0,10525,app-academy
0,11035,springboard
0,10886,metis
0,11225,practicum-by-yandex
0,10868,le-wagon
0,10494,academia-de-codigo
0,10972,react-graphql-academy


In [25]:
sub_df = pd.concat(badges_list)
clean_badges_schools = sub_df[['name','school_id']]
clean_badges_schools = clean_badges_schools.merge(clean_badges,how='inner',on='name')
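# number the rows from 1 to the actual row count instead of a hard-coded 19
clean_badges_schools.insert(0,'school_badges_id',range(1, len(clean_badges_schools) + 1))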
clean_badges_schools = clean_badges_schools[['school_badges_id','school_id','badges_id']]
clean_badges_schools

Unnamed: 0,school_badges_id,school_id,badges_id
0,1,10828,1
1,2,10853,1
2,3,10525,1
3,4,11035,1
4,5,10886,1
5,6,11225,1
6,7,10868,1
7,8,10972,1
8,9,10828,2
9,10,10828,3


In [28]:
comments

Unnamed: 0,id,name,anonymous,hostProgramName,graduatingYear,isAlumni,jobTitle,tagline,body,rawBody,...,queryDate,program,user,overallScore,comments,overall,curriculum,jobSupport,review_body,school
0,276568,Guilherme golabek brein,False,,2018.0,False,Senior Associate,Improper billing,"<span class=""truncatable""><p></p><p>A year aft...","<p>A year after completing my course, ironhack...",...,2021-04-30,Web Development Part-Time,{'image': None},1.0,[],1.0,1.0,1.0,"A year after completing my course, ironhack co...",ironhack
1,276147,Charlotte Urvoy,False,,2021.0,False,UX UI Designer,Riche et pragmatique,"<span class=""truncatable""><p></p><p>- La métho...",<p>- La m&eacute;thode d&#39;apprentissage est...,...,2021-04-20,UX/UI Design Bootcamp,{'image': None},5.0,[],5.0,5.0,5.0,- La méthode d'apprentissage est l'une des mei...,ironhack
2,275972,Anonymous,True,,2021.0,False,,Amazing experience,"<span class=""truncatable""><p></p><p>the UX/UI ...",<p>the UX/UI bootcamp has been an amazing lear...,...,2021-04-17,UX/UI Design Bootcamp,{'image': None},4.0,[],5.0,4.0,3.0,the UX/UI bootcamp has been an amazing learnin...,ironhack
3,275872,Ahmad Khalaf,False,,2021.0,False,Product Designer,Intense but good experience,"<span class=""truncatable""><p></p><p>When I sta...",<p>When I started I was a little disappointed ...,...,2021-04-15,UX/UI Design Bootcamp,{'image': None},4.0,[],4.0,4.0,4.0,When I started I was a little disappointed but...,ironhack
4,275855,Morgane Favchtein,False,,2021.0,False,UX UI Designer,Very nice experience !,"<span class=""truncatable""><p></p><p>The UX UI ...",<p>The UX UI Design bootcamp is a great way to...,...,2021-04-14,UX/UI Design Bootcamp,{'image': None},4.3,[],5.0,4.0,4.0,The UX UI Design bootcamp is a great way to tr...,ironhack
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
29,238594,Tiago Gomes,False,Software Engineering,2018.0,True,Software Developer,A life change experience.,"<p>The curriculum is really good, we really di...","The curriculum is really good, we really dive ...",...,2018-11-26,"1 Week React, Redux & GraphQL Bootcamp",{'image': None},5.0,[],5.0,5.0,5.0,"The curriculum is really good, we really dive ...",react-graphql-academy
30,245212,Peter McCarthy,False,,2018.0,True,Front End Developer,ReactJS Academy part time course,"<p>Brilliant course, covers absolutely everyth...","Brilliant course, covers absolutely everything...",...,2018-11-23,"Part time React, Redux and GraphQL",{'image': None},5.0,[],5.0,5.0,,"Brilliant course, covers absolutely everything...",react-graphql-academy
31,245192,Polly S,False,Software Engineering,2018.0,True,Frontend engineer,Great intense week of learning and practice,"<span class=""truncatable""><p>A week of encapsu...","A week of encapsulated learning, lots of infor...",...,2018-11-23,"1 Week React, Redux & GraphQL Bootcamp",{'image': None},4.7,[],5.0,4.0,5.0,"A week of encapsulated learning, lots of infor...",react-graphql-academy
32,238091,Francisco Gomes,False,Software Engineering,2018.0,True,Web Developer,Totally worth it!,"<span class=""truncatable""><p>I've attended 1-w...","I've attended 1-week in Lisbon, 1-week in Lond...",...,2018-11-07,"1 Week React, Redux & GraphQL Bootcamp",{'image': None},5.0,[],5.0,5.0,5.0,"I've attended 1-week in Lisbon, 1-week in Lond...",react-graphql-academy


In [44]:
sub_df2 = clean_schools.rename(columns={'name':'school'})
clean_comments = comments.merge(sub_df2, how='inner', on='school')
display(clean_comments.columns)
to_drop = ['anonymous', 'hostProgramName', 'graduatingYear','jobTitle', 'tagline', 'body', 'rawBody', 'createdAt',
       'queryDate', 'user', 'comments', 'review_body','school']
clean_comments.drop(to_drop, inplace=True,axis=1)
clean_comments = clean_comments.fillna(0)
# cast the four rating columns to float in one step
score_cols = ['overall', 'overallScore', 'curriculum', 'jobSupport']
clean_comments[score_cols] = clean_comments[score_cols].astype(float)

Index(['id', 'name', 'anonymous', 'hostProgramName', 'graduatingYear',
       'isAlumni', 'jobTitle', 'tagline', 'body', 'rawBody', 'createdAt',
       'queryDate', 'program', 'user', 'overallScore', 'comments', 'overall',
       'curriculum', 'jobSupport', 'review_body', 'school', 'schools_id'],
      dtype='object')

Unnamed: 0,id,name,isAlumni,program,overallScore,overall,curriculum,jobSupport,schools_id
1822,232079,Timmy Jing,True,0,5.0,5.0,5.0,0.0,10525
1776,244245,Brandon Woodruff,True,Software Engineer Track: In-Person,5.0,5.0,5.0,5.0,10525
459,246547,Anna Antràs Marti,True,Full-time UX/UI Design Bootcamp,5.0,5.0,5.0,5.0,10828
4177,255062,Joe,True,0,4.7,5.0,5.0,4.0,10868
1180,250895,Benoit,True,0,5.0,5.0,5.0,5.0,10853
2907,247814,Engin Turkmen,True,Data Science Career Track,5.0,5.0,5.0,5.0,11035
4768,244295,Irvin,True,FullStack program - 35+ locations,5.0,5.0,5.0,0.0,10868
3080,244799,Prashaanth Jagannathan,True,0,4.7,5.0,5.0,4.0,11035
131,265376,Mathias Gautier,False,Web Development Bootcamp,5.0,5.0,5.0,5.0,10828
4277,250707,Dan A,True,FullStack program - 35+ locations,5.0,5.0,5.0,0.0,10868
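
Note that `fillna(0)` turns a missing `jobSupport` rating into a score of 0, which lowers any average later computed on that column. The toy comparison below (values invented for illustration) shows the difference between filling with 0 and leaving the gaps as `NaN`, which `mean()` skips by default.

```python
import pandas as pd

job_support = pd.Series([5.0, 4.0, None, 5.0, None])

mean_with_zeros = job_support.fillna(0).mean()   # missing counted as 0
mean_skipping_nan = job_support.mean()           # missing ignored

print(mean_with_zeros, round(mean_skipping_nan, 2))
```

If the ratings are meant to feed averages, keeping missing values as `NULL` in the database is one way to avoid this bias, since SQL `AVG` also ignores `NULL`.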


In [45]:
locations_clean = locations.copy()
to_drop = ['description','country.id','country.abbrev','city.id','city.keyword','state.id','state.name','state.abbrev','state.keyword']
locations_clean.drop(to_drop,inplace=True,axis=1,)
locations_clean.rename(columns = {'id':'location_id','country.name':'country','city.name':'city'}, inplace = True)

In [46]:
clean_locations = locations_clean
clean_locations = clean_locations.fillna('Online')

In [49]:
clean_locations = clean_locations[['location_id','country','city','school_id']]

In [50]:
display(clean_schools)
display(clean_badges)
display(clean_badges_schools)
display(clean_comments)
display(clean_locations)

Unnamed: 0,schools_id,name
0,10828,ironhack
0,10853,la-capsule
0,10525,app-academy
0,11035,springboard
0,10886,metis
0,11225,practicum-by-yandex
0,10868,le-wagon
0,10494,academia-de-codigo
0,10972,react-graphql-academy


Unnamed: 0,badges_id,name
0,1,Available Online
1,2,Verified Outcomes
2,3,Flexible Classes
2,4,Job Guarantee


Unnamed: 0,school_badges_id,school_id,badges_id
0,1,10828,1
1,2,10853,1
2,3,10525,1
3,4,11035,1
4,5,10886,1
5,6,11225,1
6,7,10868,1
7,8,10972,1
8,9,10828,2
9,10,10828,3


Unnamed: 0,id,name,isAlumni,program,overallScore,overall,curriculum,jobSupport,schools_id
0,276568,Guilherme golabek brein,False,Web Development Part-Time,1.0,1.0,1.0,1.0,10828
1,276147,Charlotte Urvoy,False,UX/UI Design Bootcamp,5.0,5.0,5.0,5.0,10828
2,275972,Anonymous,False,UX/UI Design Bootcamp,4.0,5.0,4.0,3.0,10828
3,275872,Ahmad Khalaf,False,UX/UI Design Bootcamp,4.0,4.0,4.0,4.0,10828
4,275855,Morgane Favchtein,False,UX/UI Design Bootcamp,4.3,5.0,4.0,4.0,10828
...,...,...,...,...,...,...,...,...,...
5485,238594,Tiago Gomes,True,"1 Week React, Redux & GraphQL Bootcamp",5.0,5.0,5.0,5.0,10972
5486,245212,Peter McCarthy,True,"Part time React, Redux and GraphQL",5.0,5.0,5.0,0.0,10972
5487,245192,Polly S,True,"1 Week React, Redux & GraphQL Bootcamp",4.7,5.0,4.0,5.0,10972
5488,238091,Francisco Gomes,True,"1 Week React, Redux & GraphQL Bootcamp",5.0,5.0,5.0,5.0,10972


Unnamed: 0,location_id,country,city,school_id
0,15901,Germany,Berlin,10828
1,16022,Mexico,Mexico City,10828
2,16086,Netherlands,Amsterdam,10828
3,16088,Brazil,Sao Paulo,10828
4,16109,France,Paris,10828
...,...,...,...,...
1,16749,Portugal,Lisbon,10972
2,17023,Netherlands,Amsterdam,10972
3,17242,Online,Online,10972
4,17251,Germany,Berlin,10972
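
For the data-quality part of the bonus, checks of the following kind could be run on the cleaned frames before loading them. This is only a sketch: `check_unique_key` and `find_orphans` are hypothetical helper names, and the toy frames mimic `clean_schools` and `clean_locations` with one deliberately bad row added.

```python
import pandas as pd

def check_unique_key(df, key):
    """Return True if `key` has no missing and no duplicated values."""
    return bool(df[key].notna().all() and df[key].is_unique)

def find_orphans(child, parent, child_key, parent_key):
    """Return the child rows whose key has no match in the parent table."""
    return child[~child[child_key].isin(parent[parent_key])]

clean_schools = pd.DataFrame({"schools_id": [10828, 10972],
                              "name": ["ironhack", "react-graphql-academy"]})
clean_locations = pd.DataFrame({
    "location_id": [15901, 16749, 99999],
    "city": ["Berlin", "Lisbon", "Nowhere"],
    "school_id": [10828, 10972, 12345],   # 12345 is a deliberately bad ID
})

schools_key_ok = check_unique_key(clean_schools, "schools_id")
orphans = find_orphans(clean_locations, clean_schools, "school_id", "schools_id")
print(schools_key_ok)
print(orphans)
```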


In [51]:
#################CREATING A CONNECTION TO THE DATABASE#####################
import pymysql
import getpass
import mysql.connector
from sqlalchemy import create_engine

# Connect to the database

host="database-1.cesj3b2ko52z.us-east-2.rds.amazonaws.com"
port=3306
dbname="project"
user="root"
password=getpass.getpass("Database password: ")  # prompt instead of hard-coding the secret


#engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
#                       .format(user="root",
#                               pw="1234",
#                               db="project"))

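# note: .connect() returns a Connection; the name `engine` is kept because later cells use it
engine = create_engine('mysql+mysqlconnector://{0}:{1}@{2}:{3}/{4}'
            .format(user, password, host, port, dbname)).connect()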


# create cursor
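
Credentials are usually kept out of the notebook itself. Here is a minimal sketch of building the same kind of connection URL from environment variables, using only the standard library. The variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, and `DB_NAME`, and the helper `build_mysql_url`, are assumptions. `quote_plus` escapes characters such as `@` or `/` that would otherwise break the URL.

```python
import os
from urllib.parse import quote_plus

def build_mysql_url(env=os.environ):
    """Assemble a SQLAlchemy-style MySQL URL from environment variables."""
    user = quote_plus(env["DB_USER"])
    password = quote_plus(env["DB_PASSWORD"])
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "3306")
    dbname = env["DB_NAME"]
    return f"mysql+mysqlconnector://{user}:{password}@{host}:{port}/{dbname}"

# Demonstration with a plain dict standing in for os.environ
example_env = {"DB_USER": "analyst", "DB_PASSWORD": "p@ss/word",
               "DB_HOST": "localhost", "DB_NAME": "project"}
url = build_mysql_url(example_env)
print(url)
```

The resulting string can be passed to `create_engine` in place of the formatted literal.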


In [52]:
# index = False keeps the DataFrame's (non-unique) index out of the SQL table
clean_comments.to_sql('comments', con = engine,if_exists = 'replace', index = False)

In [53]:
clean_schools.to_sql('schools', con = engine,if_exists = 'replace', index = False)

In [54]:
clean_badges.to_sql('badges', con = engine,if_exists = 'replace', index = False)

In [55]:
clean_badges_schools.to_sql('school_badges_id', con = engine,if_exists = 'replace', index = False)

In [56]:
clean_locations.to_sql('locations', con = engine,if_exists = 'replace', index = False)

In [73]:
query = pd.read_sql_query("""SELECT AVG(ct.overallScore) AS overallScore, sc.name, ct.schools_id
FROM comments ct
JOIN schools sc
ON ct.schools_id = sc.schools_id
GROUP BY ct.schools_id, sc.name
ORDER BY overallScore DESC;""", engine)

In [74]:
overall_score = query

In [68]:
total_reviews

Unnamed: 0,Total_reviews,name
0,1979,le-wagon
1,1054,app-academy
2,1045,ironhack
3,923,springboard
4,159,la-capsule
5,117,metis
6,97,academia-de-codigo
7,82,practicum-by-yandex
8,34,react-graphql-academy
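
The cell that produced `total_reviews` is not shown above. A query of roughly the following shape would yield a table with `Total_reviews` and `name` columns. This is a guess at the intent, run here against a tiny in-memory SQLite database (standing in for MySQL) with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schools  (schools_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, schools_id INTEGER, overallScore REAL);
INSERT INTO schools  VALUES (10828, 'ironhack'), (10868, 'le-wagon');
INSERT INTO comments VALUES (1, 10828, 5.0), (2, 10868, 4.0), (3, 10868, 5.0);
""")

# Count reviews per school, most-reviewed school first
query = """
SELECT COUNT(ct.id) AS Total_reviews, sc.name
FROM comments ct
JOIN schools sc ON ct.schools_id = sc.schools_id
GROUP BY sc.schools_id, sc.name
ORDER BY Total_reviews DESC;
"""
total_reviews = conn.execute(query).fetchall()
print(total_reviews)
```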


In [75]:
overall_score

Unnamed: 0,overallScore,name,schools_id
0,4.987421,la-capsule,10853
1,4.960825,academia-de-codigo,10494
2,4.927792,le-wagon,10868
3,4.821368,metis,10886
4,4.801914,ironhack,10828
5,4.764706,react-graphql-academy,10972
6,4.728049,practicum-by-yandex,11225
7,4.586433,app-academy,10525
8,4.564139,springboard,11035


In [77]:
labels = ["Very Low", "Low", "Moderate", "High", "Very High"]

qbins = pd.qcut(overall_score['overallScore'],5, labels=labels)

In [78]:
overall_score['qcut'] = qbins

In [79]:
overall_score = overall_score[['overallScore','name','qcut']]

In [80]:
overall_score

Unnamed: 0,overallScore,name,qcut
0,4.987421,la-capsule,Very High
1,4.960825,academia-de-codigo,Very High
2,4.927792,le-wagon,High
3,4.821368,metis,High
4,4.801914,ironhack,Moderate
5,4.764706,react-graphql-academy,Low
6,4.728049,practicum-by-yandex,Low
7,4.586433,app-academy,Very Low
8,4.564139,springboard,Very Low
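
`pd.qcut` splits values at quantile boundaries rather than at fixed score thresholds, so the labels describe a school's rank among these nine schools, not an absolute quality level. With nine values and five bins the groups cannot all be the same size. The sketch below reruns the binning on the averages shown above and counts how many schools land in each bin.

```python
import pandas as pd

scores = pd.Series(
    [4.987421, 4.960825, 4.927792, 4.821368, 4.801914,
     4.764706, 4.728049, 4.586433, 4.564139]
)
labels = ["Very Low", "Low", "Moderate", "High", "Very High"]

# Five quantile-based bins, labelled from lowest to highest
bins = pd.qcut(scores, 5, labels=labels)
counts = bins.value_counts().reindex(labels).tolist()
print(counts)
```

All nine averages lie between about 4.56 and 4.99, so a "Very Low" label here still corresponds to a score above 4.5 out of 5.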
