<a href="https://colab.research.google.com/github/seungwon0601/Tech_Online_Courses_Analysis/blob/master/Most_Popular_Couses_in_coursera.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# What are the most popular courses in Coursera?
This is a data analysis project using the Coursera course dataset from Kaggle.

- First, look at Coursera courses related to 'Data'.
- Then look at all courses by programming language (or framework).
- Finally, look at courses by sector (field).


In [None]:
import pandas as pd
import numpy as np

In [None]:
data = pd.read_csv('/content/coursea_data.csv')

## Step 1. Data Selection

### Coursera Course Dataset | [Kaggle](https://www.kaggle.com/siddharthm1698/coursera-course-dataset#)
>This dataset contains mainly 6 columns and 890 course data. The detailed description:
1. **course_title** : Contains the course title.
2. **course_organization** : It tells which organization is conducting the courses.
3. **course_Certificate_type** : It has details about what are the different certifications available in courses.
4. **course_rating** : It has the ratings associated with each course.
5. **course_difficulty** : It tells about how difficult or what is the level of the course.
6. **course_students_enrolled** : It has the number of students that are enrolled in the course.

## Step 2. Data Lookup

In [None]:
# Look at the columns.
data.head()

Unnamed: 0.1,Unnamed: 0,course_title,course_organization,course_Certificate_type,course_rating,course_difficulty,course_students_enrolled
0,134,(ISC)² Systems Security Certified Practitioner...,(ISC)²,SPECIALIZATION,4.7,Beginner,5.3k
1,743,A Crash Course in Causality: Inferring Causal...,University of Pennsylvania,COURSE,4.7,Intermediate,17k
2,874,A Crash Course in Data Science,Johns Hopkins University,COURSE,4.5,Mixed,130k
3,413,A Law Student's Toolkit,Yale University,COURSE,4.7,Mixed,91k
4,635,A Life of Happiness and Fulfillment,Indian School of Business,COURSE,4.8,Mixed,320k


In [None]:
# Keep only the columns needed for the analysis.
data[['course_title','course_rating','course_difficulty','course_students_enrolled']]

Unnamed: 0,course_title,course_rating,course_difficulty,course_students_enrolled
0,(ISC)² Systems Security Certified Practitioner...,4.7,Beginner,5.3k
1,A Crash Course in Causality: Inferring Causal...,4.7,Intermediate,17k
2,A Crash Course in Data Science,4.5,Mixed,130k
3,A Law Student's Toolkit,4.7,Mixed,91k
4,A Life of Happiness and Fulfillment,4.8,Mixed,320k
...,...,...,...,...
886,Программирование на Python,4.5,Intermediate,52k
887,Психолингвистика (Psycholinguistics),4.8,Mixed,21k
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,Intermediate,30k
889,Русский как иностранный,4.6,Intermediate,9.8k


## Step 3. Create DataFrame

In [None]:
# Build the DataFrame.
df = pd.DataFrame(data[['course_title','course_rating','course_difficulty','course_students_enrolled']])
df

Unnamed: 0,course_title,course_rating,course_difficulty,course_students_enrolled
0,(ISC)² Systems Security Certified Practitioner...,4.7,Beginner,5.3k
1,A Crash Course in Causality: Inferring Causal...,4.7,Intermediate,17k
2,A Crash Course in Data Science,4.5,Mixed,130k
3,A Law Student's Toolkit,4.7,Mixed,91k
4,A Life of Happiness and Fulfillment,4.8,Mixed,320k
...,...,...,...,...
886,Программирование на Python,4.5,Intermediate,52k
887,Психолингвистика (Psycholinguistics),4.8,Mixed,21k
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,Intermediate,30k
889,Русский как иностранный,4.6,Intermediate,9.8k


In [None]:
df.columns=['title', 'rating', 'level', 'enrolled']
df

Unnamed: 0,title,rating,level,enrolled
0,(ISC)² Systems Security Certified Practitioner...,4.7,Beginner,5.3k
1,A Crash Course in Causality: Inferring Causal...,4.7,Intermediate,17k
2,A Crash Course in Data Science,4.5,Mixed,130k
3,A Law Student's Toolkit,4.7,Mixed,91k
4,A Life of Happiness and Fulfillment,4.8,Mixed,320k
...,...,...,...,...
886,Программирование на Python,4.5,Intermediate,52k
887,Психолингвистика (Psycholinguistics),4.8,Mixed,21k
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,Intermediate,30k
889,Русский как иностранный,4.6,Intermediate,9.8k


## Step 4. Preprocessing

In [None]:
# Check the level column.
df['level'].unique()

array(['Beginner', 'Intermediate', 'Mixed', 'Advanced'], dtype=object)

In [None]:
# Convert the level column to numeric codes.
def levels(x):
    if x == 'Beginner':
        return 1
    elif x == 'Mixed':
        return 2
    elif x == 'Intermediate':
        return 3
    elif x == 'Advanced':
        return 4

In [None]:
df['level'] = df['level'].apply(levels)
df['level'].unique()

array([1, 3, 2, 4])
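
The same encoding can also be written as a dictionary lookup. The sketch below is an alternative under stated assumptions, not the notebook's method: `sample`, `LEVEL_CODES`, and `level_code` are hypothetical names, and the codes simply mirror the ones returned by `levels()`.

```python
import pandas as pd

# Hypothetical sample holding the four labels returned by df['level'].unique().
sample = pd.DataFrame({'level': ['Beginner', 'Intermediate', 'Mixed', 'Advanced']})

# Same numeric codes as levels(): 'Mixed' sits between 'Beginner' and 'Intermediate'.
LEVEL_CODES = {'Beginner': 1, 'Mixed': 2, 'Intermediate': 3, 'Advanced': 4}

# Series.map looks each label up in the dict; labels missing from the dict become NaN.
sample['level_code'] = sample['level'].map(LEVEL_CODES)
print(sample)
```

Keeping the codes in one dictionary makes the ordering choice easy to review and change in a single place.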

In [None]:
# Check the enrolled column.
df['enrolled']

0      5.3k
1       17k
2      130k
3       91k
4      320k
       ... 
886     52k
887     21k
888     30k
889    9.8k
890     38k
Name: enrolled, Length: 891, dtype: object

In [None]:
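# Convert the unit suffix (k) to a number.
# Note: only 'k' with at most one decimal digit is handled; a value such as '1.3m' would come out as 13.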
def counts(x):
    rx = x.replace('k','000')
    if '.' in rx:
        rx = rx.replace('.','')
        rx = rx[:-1]
        return int(rx)
    return int(rx)
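
`counts()` relies on string replacement, so it only covers the `k` suffix with a single decimal digit. A possible more general parser is sketched below. It is an illustration, not the notebook's code: `parse_enrolled` is a hypothetical name, and the presence of an `m` (million) suffix in the raw data is an assumption.

```python
def parse_enrolled(text):
    """Convert strings such as '5.3k' or '1.3m' to an integer count."""
    # Assumed suffixes: 'k' for thousands, 'm' for millions.
    multipliers = {'k': 1_000, 'm': 1_000_000}
    text = text.strip().lower()
    suffix = text[-1]
    if suffix in multipliers:
        # round() guards against floating-point artifacts before truncating to int.
        return int(round(float(text[:-1]) * multipliers[suffix]))
    # No recognized suffix: treat the text as a plain number.
    return int(float(text))


for raw in ['5.3k', '17k', '9.8k', '1.3m']:
    print(raw, parse_enrolled(raw))
```

It could be applied the same way as `counts()`, for example with `df['enrolled'].apply(parse_enrolled)`.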

In [None]:
# Test that counts() works as expected.
df['enrolled'].apply(counts)

0        5300
1       17000
2      130000
3       91000
4      320000
        ...  
886     52000
887     21000
888     30000
889      9800
890     38000
Name: enrolled, Length: 891, dtype: int64

In [None]:
# Apply it to the DataFrame.
df['enrolled'] = df['enrolled'].apply(counts)
df

Unnamed: 0,title,rating,level,enrolled
0,(ISC)² Systems Security Certified Practitioner...,4.7,1,5300
1,A Crash Course in Causality: Inferring Causal...,4.7,3,17000
2,A Crash Course in Data Science,4.5,2,130000
3,A Law Student's Toolkit,4.7,2,91000
4,A Life of Happiness and Fulfillment,4.8,2,320000
...,...,...,...,...
886,Программирование на Python,4.5,3,52000
887,Психолингвистика (Psycholinguistics),4.8,2,21000
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,3,30000
889,Русский как иностранный,4.6,3,9800


In [None]:
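# Check the median of the numeric columns.
# numeric_only=True skips the text column 'title', which recent pandas versions no longer drop automatically.
df.median(numeric_only=True)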

rating          4.7
level           1.0
enrolled    42000.0
dtype: float64

In [None]:
# Check the mean of the numeric columns.
df.mean(numeric_only=True)

rating          4.677329
level           1.718294
enrolled    81012.328844
dtype: float64

In [None]:
# Check the summary statistics.
df.describe()

Unnamed: 0,rating,level,enrolled
count,891.0,891.0,891.0
mean,4.677329,1.718294,81012.328844
std,0.162225,0.880687,108555.686774
min,3.3,1.0,13.0
25%,4.6,1.0,17000.0
50%,4.7,1.0,42000.0
75%,4.8,2.0,97500.0
max,5.0,4.0,830000.0


In [None]:
# Extract and inspect the courses whose title contains the word 'Data'.
df[df['title'].str.contains('Data')]

Unnamed: 0,title,rating,level,enrolled
1,A Crash Course in Causality: Inferring Causal...,4.7,3,17000
2,A Crash Course in Data Science,4.5,2,130000
27,Advanced Data Science with IBM,4.4,4,320000
54,Applied Data Science,4.6,1,220000
55,Applied Data Science Capstone,4.7,3,42000
...,...,...,...,...
825,Tools for Data Science,4.6,1,120000
841,Understanding and Visualizing Data with Python,4.7,1,30000
849,Using Databases with Python,4.8,2,220000
850,Using Python to Access Web Data,4.8,2,310000


In [None]:
# Extract and inspect the courses whose title contains the lowercase word 'data'.
df[df['title'].str.contains('data')]

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Every match uses a capital 'D'. Sort the courses containing 'Data' by enrollment, highest first.
df[df['title'].str.contains('Data')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
196,Data Science,4.5,1,830000
199,Data Science: Foundations using R,4.6,1,740000
56,Applied Data Science with Python,4.5,3,480000
420,IBM Data Science,4.6,1,480000
800,The Data Scientist’s Toolbox,4.6,2,420000
...,...,...,...,...
612,Modernizing Data Lakes and Data Warehouses wit...,4.7,3,9100
89,Big Data – Introducción al uso práctico de dat...,4.5,1,8800
104,Building Batch Data Pipelines on GCP,4.5,3,7300
405,Health Information Literacy for Data Analytics,4.5,3,5400


In [None]:
# Courses with very low ratings should be excluded. Check whether any course is rated 4 or below.
df[df['title'].str.contains('Data')].sort_values(by=['enrolled'], ascending=False)['rating'].min()

4.2

In [None]:
# The ratings are all good. Store the result in a new variable.
Courses_data = df[df['title'].str.contains('Data')].sort_values(by=['enrolled'], ascending=False)

In [None]:
# Confirm that it was stored correctly.
Courses_data.head()

Unnamed: 0,title,rating,level,enrolled
196,Data Science,4.5,1,830000
199,Data Science: Foundations using R,4.6,1,740000
56,Applied Data Science with Python,4.5,3,480000
420,IBM Data Science,4.6,1,480000
800,The Data Scientist’s Toolbox,4.6,2,420000


In [None]:
# Extract only the courses with more than 100,000 enrolled students.
Courses_data[Courses_data['enrolled']>100000]

Unnamed: 0,title,rating,level,enrolled
196,Data Science,4.5,1,830000
199,Data Science: Foundations using R,4.6,1,740000
56,Applied Data Science with Python,4.5,3,480000
420,IBM Data Science,4.6,1,480000
800,The Data Scientist’s Toolbox,4.6,2,420000
684,Python Data Structures,4.9,2,420000
487,Introduction to Data Science in Python,4.5,3,390000
27,Advanced Data Science with IBM,4.4,4,320000
850,Using Python to Access Web Data,4.8,2,310000
486,Introduction to Data Science,4.6,1,310000


In [None]:
# Check the number of courses.
len(Courses_data[Courses_data['enrolled']>100000])

36

In [None]:
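# Store the courses with more than 100,000 enrolled students, and extract the top 10 separately.
AboutData_BestSellers = Courses_data[Courses_data['enrolled']>100000]
# Courses_data is already sorted by enrollment, so the first 10 rows are the top 10.
AboutData_Top_Ten = AboutData_BestSellers[:10]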
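
Slicing the first ten rows only gives the top ten when the frame is already sorted by enrollment. As a hedged alternative, `DataFrame.nlargest` picks the top rows regardless of the current order. The sample below is hypothetical and reuses a few titles from the output above; `courses`, `best_sellers`, and `top_two` are illustrative names.

```python
import pandas as pd

# Hypothetical sample, deliberately left unsorted.
courses = pd.DataFrame({
    'title': [
        'Tools for Data Science',
        'Data Science',
        'Building Batch Data Pipelines on GCP',
        'IBM Data Science',
    ],
    'enrolled': [120000, 830000, 7300, 480000],
})

# Keep courses with more than 100,000 students, then take the largest ones.
best_sellers = courses[courses['enrolled'] > 100000]
top_two = best_sellers.nlargest(2, 'enrolled')
print(top_two)
```

With the real data, `nlargest(10, 'enrolled')` would play the role of the `[:10]` slice.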

In [None]:
AboutData_Top_Ten

Unnamed: 0,title,rating,level,enrolled
196,Data Science,4.5,1,830000
199,Data Science: Foundations using R,4.6,1,740000
56,Applied Data Science with Python,4.5,3,480000
420,IBM Data Science,4.6,1,480000
800,The Data Scientist’s Toolbox,4.6,2,420000
684,Python Data Structures,4.9,2,420000
487,Introduction to Data Science in Python,4.5,3,390000
27,Advanced Data Science with IBM,4.4,4,320000
850,Using Python to Access Web Data,4.8,2,310000
486,Introduction to Data Science,4.6,1,310000


## Step 5. Apply - Language/Framework
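So far, from the 891 Coursera courses in the dataset, we have extracted the courses whose titles contain 'Data' and ranked them by cumulative enrollment.
Using the same approach, let's find the best sellers for each language or framework.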

1. **Python** Best Sellers
2. **R** Best Sellers
3. **Java** Best Sellers
4. **JavaScript** Best Sellers
5. **Django** Best Sellers
6. **Flask** Best Sellers
7. **React** Best Sellers
8. **Vue** Best Sellers
9. **TensorFlow** Best Sellers
10. **PyTorch** Best Sellers
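
Because every keyword goes through the same filter, threshold, and sort steps, the pattern could be wrapped in a small helper. This is a minimal sketch under stated assumptions: `best_sellers` and its `min_enrolled` parameter are hypothetical names, and the sample rows are copied from outputs shown elsewhere in this notebook.

```python
import pandas as pd


def best_sellers(frame, keyword, min_enrolled=100000):
    """Hypothetical helper: courses whose title contains `keyword`
    and whose enrollment exceeds `min_enrolled`, highest first."""
    # regex=False treats the keyword as plain text, not as a pattern.
    matches = frame[frame['title'].str.contains(keyword, regex=False)]
    popular = matches[matches['enrolled'] > min_enrolled]
    return popular.sort_values(by='enrolled', ascending=False)


# Hypothetical sample with the same column names as df.
sample = pd.DataFrame({
    'title': ['Python Basics', 'R Programming',
              'Программирование на Python', 'Python Data Structures'],
    'enrolled': [110000, 480000, 52000, 420000],
})

python_best = best_sellers(sample, 'Python')
print(python_best)
```

A plain substring match still needs manual review for short keywords such as 'R', as the sections below show.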

### 1. Python Best Sellers

In [None]:
# Check all Python courses
df[df['title'].str.contains('Python')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
56,Applied Data Science with Python,4.5,3,480000
684,Python Data Structures,4.9,2,420000
487,Introduction to Data Science in Python,4.5,3,390000
850,Using Python to Access Web Data,4.8,2,310000
849,Using Databases with Python,4.8,2,220000
687,Python for Data Science and AI,4.6,1,170000
57,Applied Machine Learning in Python,4.6,3,150000
570,Machine Learning with Python,4.7,3,120000
530,Introdução à Ciência da Computação com Python ...,4.9,1,120000
682,Python Basics,4.8,1,110000


In [None]:
# Python courses with more than 100,000 cumulative enrollments
Python_Courses = df[df['title'].str.contains('Python')].sort_values(by=['enrolled'], ascending=False)
Python_Courses[Python_Courses['enrolled']>100000]

Unnamed: 0,title,rating,level,enrolled
56,Applied Data Science with Python,4.5,3,480000
684,Python Data Structures,4.9,2,420000
487,Introduction to Data Science in Python,4.5,3,390000
850,Using Python to Access Web Data,4.8,2,310000
849,Using Databases with Python,4.8,2,220000
687,Python for Data Science and AI,4.6,1,170000
57,Applied Machine Learning in Python,4.6,3,150000
570,Machine Learning with Python,4.7,3,120000
530,Introdução à Ciência da Computação com Python ...,4.9,1,120000
682,Python Basics,4.8,1,110000


In [None]:
# Store the result
Python_BestSellers = Python_Courses[Python_Courses['enrolled']>100000]

### 2. R Best Sellers

In [None]:
# Check R courses (1): titles containing 'R' followed by a space
df[df['title'].str.contains('R ')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
691,R Programming,4.6,3,480000
415,Human Resource Management: HR for People Managers,4.7,1,98000
751,Statistical Analysis with R for Public Health,4.7,1,16000
844,Unity XR: How to Build AR and VR Apps,4.2,1,12000


In [None]:
# Check R courses (1)
# Extract only row 691.
df[df['title'].str.contains('R ')].sort_values(by=['enrolled'], ascending=False).iloc[0,:]

title       R Programming
rating                4.6
level                   3
enrolled           480000
Name: 691, dtype: object

In [None]:
# Store the extracted row
R_temp_1 = df[df['title'].str.contains('R ')].sort_values(by=['enrolled'], ascending=False).iloc[0,:]
R_temp_1=pd.DataFrame(R_temp_1).T

In [None]:
# Check R courses (2): titles ending with 'R'
df[df['title'].str[-1] =='R'].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
199,Data Science: Foundations using R,4.6,1,740000
753,Statistics with R,4.7,1,220000
652,Photography Basics and Beyond: From Smartphone...,4.7,1,160000
465,Introducción a Data Science: Programación Esta...,4.7,1,140000
592,Mastering Software Development in R,4.3,1,52000


In [None]:
# Check R courses (2)
# Extract only rows 199 and 753.
df[df['title'].str[-1] =='R'].sort_values(by=['enrolled'], ascending=False).iloc[:2,:]

Unnamed: 0,title,rating,level,enrolled
199,Data Science: Foundations using R,4.6,1,740000
753,Statistics with R,4.7,1,220000


In [None]:
# Store the extracted rows
R_temp_2 = df[df['title'].str[-1] =='R'].sort_values(by=['enrolled'], ascending=False).iloc[:2,:]

In [None]:
pd.concat([R_temp_1, R_temp_2])

Unnamed: 0,title,rating,level,enrolled
691,R Programming,4.6,3,480000
199,Data Science: Foundations using R,4.6,1,740000
753,Statistics with R,4.7,1,220000


In [None]:
# Store the best sellers
R_BestSellers = pd.concat([R_temp_1, R_temp_2])
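
The two manual passes above were needed because a bare 'R' also matches titles such as 'HR' or 'VR'. One possible shortcut is a regular expression with word boundaries, sketched here on titles taken from the outputs above. `titles` and `is_r_course` are hypothetical names, and the result would still need a quick review (a title containing something like 'R&D' would also match).

```python
import pandas as pd

titles = pd.Series([
    'R Programming',
    'Human Resource Management: HR for People Managers',
    'Statistics with R',
    'Unity XR: How to Build AR and VR Apps',
    'Statistical Analysis with R for Public Health',
])

# \bR\b matches 'R' only when it stands alone as a word,
# so 'HR', 'XR', 'AR', 'VR', and 'Resource' are skipped.
is_r_course = titles.str.contains(r'\bR\b', regex=True)
print(titles[is_r_course].tolist())
```

The same idea would separate 'React' from 'Reactions' with the pattern `r'\bReact\b'`.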

### 3. Java Best Sellers

In [None]:
# Check Java courses
df[df['title'].str.contains('Java')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
535,Java Programming and Software Engineering Fund...,4.6,1,380000
633,Object Oriented Programming in Java,4.6,1,330000
632,Object Oriented Java Programming: Data Structu...,4.7,3,250000
672,"Programming Foundations with JavaScript, HTML ...",4.6,1,250000
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000
536,Java Programming: Solving Problems with Software,4.6,1,160000
538,Kotlin for Java Developers,4.7,3,37000
645,"Parallel, Concurrent, and Distributed Programm...",4.5,3,30000
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,3,30000
107,Building Scalable Java Microservices with Spri...,4.3,3,24000


In [None]:
Java_Courses = df[df['title'].str.contains('Java')].sort_values(by=['enrolled'], ascending=False)
Java_BestSellers = Java_Courses[Java_Courses['enrolled']>100000]
Java_BestSellers

Unnamed: 0,title,rating,level,enrolled
535,Java Programming and Software Engineering Fund...,4.6,1,380000
633,Object Oriented Programming in Java,4.6,1,330000
632,Object Oriented Java Programming: Data Structu...,4.7,3,250000
672,"Programming Foundations with JavaScript, HTML ...",4.6,1,250000
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000
536,Java Programming: Solving Problems with Software,4.6,1,160000


In [None]:
# JavaScript courses are also included; exclude both spellings ('JavaScript' and 'Javascript').
Java_BestSellers = Java_BestSellers[~Java_BestSellers['title'].str.contains('JavaScript')]
Java_BestSellers = Java_BestSellers[~Java_BestSellers['title'].str.contains('Javascript')]
Java_BestSellers

Unnamed: 0,title,rating,level,enrolled
535,Java Programming and Software Engineering Fund...,4.6,1,380000
633,Object Oriented Programming in Java,4.6,1,330000
632,Object Oriented Java Programming: Data Structu...,4.7,3,250000
536,Java Programming: Solving Problems with Software,4.6,1,160000


### 4. JavaScript Best Sellers


In [None]:
# Check JavaScript courses
df[df['title'].str.contains('JavaScript')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
672,"Programming Foundations with JavaScript, HTML ...",4.6,1,250000
888,Разработка интерфейсов: вёрстка и JavaScript,4.5,3,30000


In [None]:
JavaScript_temp_1 = df[df['title'].str.contains('JavaScript')].sort_values(by=['enrolled'], ascending=False).iloc[0,:]
JavaScript_temp_1 = pd.DataFrame(JavaScript_temp_1).T

In [None]:
# Temporary assignment before concatenation
JavaScript_temp_1

Unnamed: 0,title,rating,level,enrolled
672,"Programming Foundations with JavaScript, HTML ...",4.6,1,250000


In [None]:
df[df['title'].str.contains('Javascript')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000


In [None]:
# Temporary assignment before concatenation
JavaScript_temp_2 = df[df['title'].str.contains('Javascript')].sort_values(by=['enrolled'], ascending=False)

In [None]:
pd.concat([JavaScript_temp_1, JavaScript_temp_2])

Unnamed: 0,title,rating,level,enrolled
672,"Programming Foundations with JavaScript, HTML ...",4.6,1,250000
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000


In [None]:
# Store the best sellers
JavaScript_BestSellers = pd.concat([JavaScript_temp_1, JavaScript_temp_2])
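
The two spellings ('JavaScript' and 'Javascript') can also be caught in a single pass with a case-insensitive match. The following is a sketch on a hypothetical sample built from rows shown above; `sample`, `is_js`, and `js_best` are illustrative names.

```python
import pandas as pd

sample = pd.DataFrame({
    'title': [
        'Programming Foundations with JavaScript, HTML ...',
        'HTML, CSS, and Javascript for Web Developers',
        'Object Oriented Programming in Java',
        'Разработка интерфейсов: вёрстка и JavaScript',
    ],
    'enrolled': [250000, 240000, 330000, 30000],
})

# case=False ignores letter case, so both spellings match;
# the plain 'Java' title does not contain 'javascript' and is skipped.
is_js = sample['title'].str.contains('javascript', case=False)

# Apply the same threshold used for the other best-seller tables.
js_best = sample[is_js & (sample['enrolled'] > 100000)]
js_best = js_best.sort_values(by='enrolled', ascending=False)
print(js_best)
```

This avoids building two temporary frames and concatenating them, at the cost of also matching any other capitalization that might appear.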

### 5. Django Best Sellers

In [None]:
# Check Django courses
df[df['title'].str.contains('Django')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


### 6. Flask Best Sellers

In [None]:
# Check Flask courses
df[df['title'].str.contains('Flask')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


### 7. React Best Sellers

In [None]:
# Check React courses
df[df['title'].str.contains('React')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
339,Full-Stack Web Development with React,4.7,3,140000
476,Introduction to Chemistry: Reactions and Ratios,4.8,1,46000
336,Front-End Web Development with React,4.7,3,45000


In [None]:
React_Courses = df[df['title'].str.contains('React')].sort_values(by=['enrolled'], ascending=False)
React_BestSellers = React_Courses[React_Courses['enrolled']>100000]

In [None]:
# Stored best sellers
React_BestSellers

Unnamed: 0,title,rating,level,enrolled
339,Full-Stack Web Development with React,4.7,3,140000


### 8. Vue Best Sellers

In [None]:
# Check Vue courses
df[df['title'].str.contains('Vue')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check Vue courses (lowercase 'vue')
df[df['title'].str.contains('vue')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


### 9. TensorFlow Best Sellers

In [None]:
# Check TensorFlow courses
df[df['title'].str.contains('TensorFlow')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
787,TensorFlow in Practice,4.7,3,170000
520,Introduction to TensorFlow for Artificial Inte...,4.7,3,150000
571,Machine Learning with TensorFlow on Google Clo...,4.5,3,72000
163,Convolutional Neural Networks in TensorFlow,4.7,3,46000
621,Natural Language Processing in TensorFlow,4.6,3,40000
29,Advanced Machine Learning with TensorFlow on G...,4.5,4,35000
788,TensorFlow: Data and Deployment,4.5,3,12000


In [None]:
Tensorflow_Courses = df[df['title'].str.contains('TensorFlow')].sort_values(by=['enrolled'], ascending=False)
Tensorflow_BestSellers = Tensorflow_Courses[Tensorflow_Courses['enrolled']>100000]
# Store the best sellers
Tensorflow_BestSellers

Unnamed: 0,title,rating,level,enrolled
787,TensorFlow in Practice,4.7,3,170000
520,Introduction to TensorFlow for Artificial Inte...,4.7,3,150000


### 10. PyTorch Best Sellers

In [None]:
# Check PyTorch courses
df[df['title'].str.contains('PyTorch')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


## Step 6. Apply - Sector
Next, let's look at the popular courses by sector.

1. Web
2. App
3. Algorithm
4. Computer
5. Hacking
6. Finance
7. Commerce
8. Neural Network
9. Machine Learning
10. Deep Learning
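
Before going through the sectors one by one, it can help to count how many titles match each keyword, so that keywords with no matches are spotted early. The sketch below uses a hypothetical sample with titles taken from this notebook's outputs; `keywords` and `match_counts` are illustrative names.

```python
import pandas as pd

sample = pd.DataFrame({'title': [
    'Web Applications for Everybody',
    'Algorithmic Toolbox',
    'Introduction to Corporate Finance',
    'Security in Google Cloud Platform',
]})

keywords = ['Web', 'Algorithm', 'Hacking', 'Finance', 'Commerce']

# Count the titles containing each keyword (plain substring match).
match_counts = {
    keyword: int(sample['title'].str.contains(keyword, regex=False).sum())
    for keyword in keywords
}
print(match_counts)
```

A keyword with a count of zero is a signal to try a related term instead, such as 'Security' in place of 'Hacking'.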

### 1. Web

In [None]:
# Check Web courses
df[df['title'].str.contains('Web')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
850,Using Python to Access Web Data,4.8,2,310000
859,Web Design for Everybody: Basics of Web Develo...,4.7,1,280000
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000
701,Responsive Website Development and Design,4.5,1,170000
338,Full Stack Web and Multiplatform Mobile App De...,4.7,3,150000
339,Full-Stack Web Development with React,4.7,3,140000
413,How To Create a Website in a Weekend! (Project...,3.3,2,140000
858,Web Applications for Everybody,4.8,3,120000
337,Front-End Web UI Frameworks and Tools: Bootstr...,4.8,3,89000
526,Introduction to Web Development,4.7,1,76000


In [None]:
Web_Courses = df[df['title'].str.contains('Web')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Web_BestSellers = Web_Courses[Web_Courses['enrolled']>100000]
Web_BestSellers

Unnamed: 0,title,rating,level,enrolled
850,Using Python to Access Web Data,4.8,2,310000
859,Web Design for Everybody: Basics of Web Develo...,4.7,1,280000
399,"HTML, CSS, and Javascript for Web Developers",4.8,2,240000
701,Responsive Website Development and Design,4.5,1,170000
338,Full Stack Web and Multiplatform Mobile App De...,4.7,3,150000
339,Full-Stack Web Development with React,4.7,3,140000
413,How To Create a Website in a Weekend! (Project...,3.3,2,140000
858,Web Applications for Everybody,4.8,3,120000


### 2. App

In [None]:
# Check App courses
df[df['title'].str.contains('App')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
56,Applied Data Science with Python,4.5,3,480000
221,Developing Applications with Google Cloud Plat...,4.7,3,300000
565,Machine Learning Foundations: A Case Study App...,4.6,2,240000
54,Applied Data Science,4.6,1,220000
187,Data Analysis and Presentation Skills: the PwC...,4.6,1,220000
418,IBM Applied AI,4.6,1,220000
57,Applied Machine Learning in Python,4.6,3,150000
338,Full Stack Web and Multiplatform Mobile App De...,4.7,3,150000
858,Web Applications for Everybody,4.8,3,120000
49,Android App Development,4.5,1,120000


In [None]:
App_Courses = df[df['title'].str.contains('App')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
App_BestSellers = App_Courses[App_Courses['enrolled']>100000]
App_BestSellers

Unnamed: 0,title,rating,level,enrolled
56,Applied Data Science with Python,4.5,3,480000
221,Developing Applications with Google Cloud Plat...,4.7,3,300000
565,Machine Learning Foundations: A Case Study App...,4.6,2,240000
54,Applied Data Science,4.6,1,220000
187,Data Analysis and Presentation Skills: the PwC...,4.6,1,220000
418,IBM Applied AI,4.6,1,220000
57,Applied Machine Learning in Python,4.6,3,150000
338,Full Stack Web and Multiplatform Mobile App De...,4.7,3,150000
858,Web Applications for Everybody,4.8,3,120000
49,Android App Development,4.5,1,120000


### 3. Algorithm

In [None]:
# Check Algorithm courses
df[df['title'].str.contains('Algorithm')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
37,Algorithmic Toolbox,4.7,3,220000
38,Algorithms,4.8,3,150000
238,"Divide and Conquer, Sorting and Searching, and...",4.8,3,130000
39,Algorithms for Battery Management Systems,4.8,3,14000


In [None]:
Algorithm_Courses = df[df['title'].str.contains('Algorithm')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Algorithm_BestSellers = Algorithm_Courses[Algorithm_Courses['enrolled']>100000]
Algorithm_BestSellers

Unnamed: 0,title,rating,level,enrolled
37,Algorithmic Toolbox,4.7,3,220000
38,Algorithms,4.8,3,150000
238,"Divide and Conquer, Sorting and Searching, and...",4.8,3,130000


### 4. Computer

In [None]:
# Check Computer courses
df[df['title'].str.contains('Computer')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
794,The Bits and Bytes of Computer Networking,4.7,1,130000
103,Build a Modern Computer from First Principles:...,4.9,2,95000
489,Introduction to Discrete Mathematics for Compu...,4.4,1,75000
480,Introduction to Computer Science and Programming,4.3,1,32000
152,Computer Security and Systems Management,4.6,1,27000
19,Accelerated Computer Science Fundamentals,4.7,3,22000
479,Introduction to Computer Programming,4.3,1,20000


In [None]:
Computer_Courses = df[df['title'].str.contains('Computer')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Computer_BestSellers = Computer_Courses[Computer_Courses['enrolled']>100000]
Computer_BestSellers

Unnamed: 0,title,rating,level,enrolled
794,The Bits and Bytes of Computer Networking,4.7,1,130000


### 5. Hacking

In [None]:
# Check Hacking courses
df[df['title'].str.contains('Hacking')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check related courses (Security)
df[df['title'].str.contains('Security')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
721,Security in Google Cloud Platform,4.7,3,300000
422,IT Security: Defense against the digital dark ...,4.8,1,62000
435,Information Security: Context and Introduction,4.6,1,32000
483,Introduction to Cyber Security,4.7,1,32000
152,Computer Security and Systems Management,4.6,1,27000
624,Networking and Security Architecture with VMwa...,4.8,3,17000
720,Security & Safety Challenges in a Globalized W...,4.7,1,14000
13,AWS Fundamentals: Addressing Security Risk,4.3,1,11000
410,Homeland Security and Cybersecurity,4.7,1,5500
0,(ISC)² Systems Security Certified Practitioner...,4.7,1,5300


In [None]:
Security_Courses = df[df['title'].str.contains('Security')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Security_BestSellers = Security_Courses[Security_Courses['enrolled']>100000]
Security_BestSellers

Unnamed: 0,title,rating,level,enrolled
721,Security in Google Cloud Platform,4.7,3,300000


### 6. Finance

In [None]:
# Check Finance courses
df[df['title'].str.contains('Finance')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
306,Finance & Quantitative Modeling for Analysts,4.5,1,260000
481,Introduction to Corporate Finance,4.6,2,130000
307,Finance for Non-Finance Professionals,4.8,1,69000
165,Corporate Finance Essentials,4.8,2,56000
87,Behavioral Finance,4.4,2,55000
285,Essentials of Corporate Finance,4.6,3,54000
566,Machine Learning and Reinforcement Learning in...,3.7,3,29000
326,Foundational Finance for Strategic Decision Ma...,4.8,1,14000
304,FinTech: Finance Industry Transformation and R...,4.6,1,13000
308,Finance for Non-Financial Managers,4.4,3,12000


In [None]:
Finance_Courses = df[df['title'].str.contains('Finance')].sort_values(by=['enrolled'], ascending=False)
Finance_BestSellers = Finance_Courses[Finance_Courses['enrolled']>100000]
Finance_BestSellers

Unnamed: 0,title,rating,level,enrolled
306,Finance & Quantitative Modeling for Analysts,4.5,1,260000
481,Introduction to Corporate Finance,4.6,2,130000


### 7. Commerce

In [None]:
# Check Commerce courses
df[df['title'].str.contains('Commerce')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check related courses (Shopping)
df[df['title'].str.contains('Shopping')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check related courses (Business)
df[df['title'].str.contains('Business')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
113,Business Foundations,4.7,1,510000
296,Excel to MySQL: Analytic Techniques for Business,4.6,1,490000
109,Business Analytics,4.6,1,280000
291,Excel Skills for Business,4.9,1,240000
260,English for Business and Entrepreneurship,4.8,1,230000
293,Excel Skills for Business: Essentials,4.9,1,200000
121,Business and Financial Modeling,4.5,1,160000
205,Data Warehousing for Business Intelligence,4.5,4,140000
111,Business English Communication Skills,4.7,3,120000
115,Business Statistics and Analysis,4.7,1,110000


In [None]:
Business_Courses = df[df['title'].str.contains('Business')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Business_BestSellers = Business_Courses[Business_Courses['enrolled']>100000]
Business_BestSellers

Unnamed: 0,title,rating,level,enrolled
113,Business Foundations,4.7,1,510000
296,Excel to MySQL: Analytic Techniques for Business,4.6,1,490000
109,Business Analytics,4.6,1,280000
291,Excel Skills for Business,4.9,1,240000
260,English for Business and Entrepreneurship,4.8,1,230000
293,Excel Skills for Business: Essentials,4.9,1,200000
121,Business and Financial Modeling,4.5,1,160000
205,Data Warehousing for Business Intelligence,4.5,4,140000
111,Business English Communication Skills,4.7,3,120000
115,Business Statistics and Analysis,4.7,1,110000


### 8. Neural Network

In [None]:
# Check Neural Network courses
df[df['title'].str.contains('Neural')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
626,Neural Networks and Deep Learning,4.9,3,630000
427,Improving Deep Neural Networks: Hyperparameter...,4.9,1,270000
162,Convolutional Neural Networks,4.9,3,240000
163,Convolutional Neural Networks in TensorFlow,4.7,3,46000


In [None]:
# Check related courses (CNN)
df[df['title'].str.contains('CNN')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check related courses (RNN)
df[df['title'].str.contains('RNN')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled


In [None]:
# Check related courses (Reinforcement)
df[df['title'].str.contains('Reinforcement')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
566,Machine Learning and Reinforcement Learning in...,3.7,3,29000
695,Reinforcement Learning,4.7,3,23000
353,Fundamentals of Reinforcement Learning,4.8,3,22000


In [None]:
# Check Neural Network courses
Neural_Courses = df[df['title'].str.contains('Neural')].sort_values(by=['enrolled'], ascending=False)
# Store the best sellers
Neural_BestSellers = Neural_Courses[Neural_Courses['enrolled']>100000]
Neural_BestSellers

Unnamed: 0,title,rating,level,enrolled
626,Neural Networks and Deep Learning,4.9,3,630000
427,Improving Deep Neural Networks: Hyperparameter...,4.9,1,270000
162,Convolutional Neural Networks,4.9,3,240000


### 9. Machine Learning

In [None]:
# Check Machine Learning courses
df[df['title'].str.contains('Machine')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
563,Machine Learning,4.6,3,290000
565,Machine Learning Foundations: A Case Study App...,4.6,2,240000
763,Structuring Machine Learning Projects,4.8,1,220000
200,Data Science: Statistics and Machine Learning,4.4,3,210000
28,Advanced Machine Learning,4.5,4,190000
57,Applied Machine Learning in Python,4.6,3,150000
595,Mathematics for Machine Learning,4.6,1,150000
520,Introduction to TensorFlow for Artificial Inte...,4.7,3,150000
596,Mathematics for Machine Learning: Linear Algebra,4.7,1,140000
570,Machine Learning with Python,4.7,3,120000


In [None]:
Machine_Courses = df[df['title'].str.contains('Machine')].sort_values(by=['enrolled'], ascending=False)
Machine_BestSellers = Machine_Courses[Machine_Courses['enrolled']>100000]
Machine_BestSellers

Unnamed: 0,title,rating,level,enrolled
563,Machine Learning,4.6,3,290000
565,Machine Learning Foundations: A Case Study App...,4.6,2,240000
763,Structuring Machine Learning Projects,4.8,1,220000
200,Data Science: Statistics and Machine Learning,4.4,3,210000
28,Advanced Machine Learning,4.5,4,190000
57,Applied Machine Learning in Python,4.6,3,150000
595,Mathematics for Machine Learning,4.6,1,150000
520,Introduction to TensorFlow for Artificial Inte...,4.7,3,150000
596,Mathematics for Machine Learning: Linear Algebra,4.7,1,140000
570,Machine Learning with Python,4.7,3,120000


### 10. Deep Learning

In [None]:
# Check Deep Learning courses
df[df['title'].str.contains('Deep')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
211,Deep Learning,4.8,3,690000
626,Neural Networks and Deep Learning,4.9,3,630000
427,Improving Deep Neural Networks: Hyperparameter...,4.9,1,270000
520,Introduction to TensorFlow for Artificial Inte...,4.7,3,150000


In [None]:
# All of them have more than 100,000 students, so store them all as best sellers.
Deep_BestSellers = df[df['title'].str.contains('Deep')].sort_values(by=['enrolled'], ascending=False)

# Work in progress

### 11. AWS

In [None]:
df[df['title'].str.contains('AWS')].sort_values(by=['enrolled'], ascending=False)

Unnamed: 0,title,rating,level,enrolled
12,AWS Fundamentals,4.6,1,130000
15,AWS Fundamentals: Going Cloud-Native,4.7,1,110000
375,Getting Started with AWS Machine Learning,4.5,3,73000
390,Google Cloud Platform Fundamentals for AWS Pro...,4.7,3,36000
14,AWS Fundamentals: Building Serverless Applicat...,4.7,1,27000
16,AWS Fundamentals: Migrating to the Cloud,4.5,3,13000
13,AWS Fundamentals: Addressing Security Risk,4.3,1,11000


In [None]:
AWS_Courses = df[df['title'].str.contains('AWS')].sort_values(by=['enrolled'], ascending=False)
AWS_BestSellers = AWS_Courses[AWS_Courses['enrolled']>100000]
AWS_BestSellers

Unnamed: 0,title,rating,level,enrolled
12,AWS Fundamentals,4.6,1,130000
15,AWS Fundamentals: Going Cloud-Native,4.7,1,110000
