Hackathon: New York City Taxi Trip Duration

ML roadmap : https://www.kaggle.com/discussions/getting-started/211797

Keywords worth looking through: GitHub - AMAI-GmbH/AI-Expert-Roadmap: Roadmap to becoming an Artificial Intelligence Expert in 2022 (https://github.com/AMAI-GmbH/AI-Expert-Roadmap)

This is the list of lecture notes.

Lecture notes are provided as PDF files and lab files as .ipynb notebooks. The original lecture notes cannot be made public.

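00. Big Data Basics

Lecture notes covering the basics of big data and artificial intelligence.

01. Introduction to Machine Learning

02. Mathematics for Big Data

Linear algebra
Multivariable calculus and optimization
Probability and statistics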

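03. Python (beginner to advanced / some data structures)

Introduction
Basics
Data types
Conditionals
Loops
File reading/writing
Functions
Classes and modules
Exception handling

Exercises are included throughout. Answers are released only during class.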

04. NumPy (linear algebra)

list vs. NumPy
Advantages of NumPy
Linear algebra concepts: lecture and practice
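
As a small illustration of the "list vs. NumPy" topic, the sketch below contrasts element-wise arithmetic on a Python list and on a NumPy array, then solves a 2x2 linear system. The variable names and the numbers are made up for illustration and are not taken from the lecture notes.

```python
import numpy as np

# Hypothetical toy values, used only for illustration.
prices = [1.0, 2.0, 3.0]

# A list needs an explicit loop (or comprehension) for element-wise math.
doubled_list = [p * 2 for p in prices]

# A NumPy array applies the operation to every element at once.
doubled_array = np.array(prices) * 2

# On a list, "*" repeats the sequence instead of multiplying its elements.
repeated = prices * 2  # six elements: the list concatenated with itself

# Basic linear algebra: solve A x = b for x.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)  # approximately [0.8, 1.4]
```

The difference in the meaning of `*` is a common source of bugs when code moves between lists and arrays.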

05. Pandas (processing CSV data)

06. Matplotlib (plotting) / seaborn / plotly

============================Machine Learning============================

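07. Machine Learning with scikit-learn

08. Regression Analysis

What is regression analysis?
Gradient descent: introduction and proof
Evaluating regression (R^2, adjusted R^2, AIC, BIC)
P-value
Ordinary Least Squares proof; Ridge, Lasso, and Elastic Net proofs
Bias vs. Variance
Data transformation
Logistic Regression (proof and introduction to the odds ratio)
Introduction to Poisson Regression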
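
To make the gradient descent, Ordinary Least Squares, and R^2 topics above concrete, here is a minimal pure-Python sketch for a one-variable linear model. The function names (`ols_fit`, `gd_fit`, `r_squared`), the learning rate, the step count, and the toy data are all assumptions chosen for illustration; they are not taken from the lecture notes.

```python
def ols_fit(xs, ys):
    """Closed-form least squares for the model y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def gd_fit(xs, ys, lr=0.05, steps=5000):
    """Batch gradient descent on the mean squared error of y = a * x + b."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical toy data lying close to the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a_ols, b_ols = ols_fit(xs, ys)
a_gd, b_gd = gd_fit(xs, ys)
score = r_squared(xs, ys, a_ols, b_ols)
```

With a small enough learning rate, the iterative estimate should approach the closed-form one, which is a useful sanity check when implementing either method by hand.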

Introduction to Mixture Models

Practice:

scikit-learn tutorial with the Boston House dataset -> also introduces KFold
scikit-learn tutorial with load_diabetes
scikit-learn tutorial with the Wisconsin (diagnostic) dataset
Kaggle Titanic dataset

HW: House advanced regression problem

09. Classification

k-nearest neighbors
Naive Bayes
Decision Tree
Random Forest
AdaBoost
Gradient Boosting
XGBoost
LightGBM
CatBoost
Automatic hyperparameter tuning: Optuna / Imbalanced data
Ensemble learning (bagging, boosting, voting, stacking)
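
As a sketch of the first classifier in this list, the snippet below implements k-nearest neighbors with a majority vote in plain Python. The function name `knn_predict`, the toy points, and the labels are hypothetical and only serve to show the idea; in practice a library implementation would normally be used.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

near_first_group = knn_predict(points, labels, (1.1, 1.0))
near_second_group = knn_predict(points, labels, (5.0, 4.8))
```

The same voting idea, applied to the outputs of several different models instead of several neighbors, is the basis of the voting ensembles mentioned at the end of the list.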

Practice:

Mushroom Classification (https://www.kaggle.com/uciml/mushroom-classification)
Otto Group Product Classification Challenge (https://www.kaggle.com/c/otto-group-product-classification-challenge)
Cardiovascular Disease (https://www.kaggle.com/sulianova/cardiovascular-disease-dataset)
Prudential Life Insurance Assessment (https://www.kaggle.com/c/prudential-life-insurance-assessment)
Imbalanced Data: Credit Card Fraud Detection (https://www.kaggle.com/mlg-ulb/creditcardfraud)

Classification 2

Support Vector Machine
Kernel Method

Unsupervised Learning

Dimensionality Reduction

Principal Component Analysis (PCA)
Linear Discriminant Analysis (LDA)
Singular Value Decomposition (SVD)
Non-negative Matrix Factorization (NMF)
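
PCA and SVD in this list are closely related: the principal directions of centered data are the right singular vectors of the data matrix. The sketch below shows that connection with NumPy. The helper name `pca_svd` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def pca_svd(X, n_components):
    """Project X onto its top principal components using the SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction (sample variance, n - 1 divisor).
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    projected = X_centered @ components.T
    return projected, components, explained_variance[:n_components]


# Hypothetical synthetic data: points scattered tightly around the line y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t + 0.01 * rng.normal(size=200)])

Z, components, variance = pca_svd(X, n_components=1)
```

Because the data lies almost on a line, a single component should capture nearly all of the variance, and its direction should be close to (1, 2) up to sign and scale.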

Unsupervised Learning

### Clustering

K-nearest neighbors
K-means, K-medians, K-medoids
Elbow method with K-means
Mean Shift
Hierarchical Clustering
Gaussian Mixture Model
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
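
For the K-means entry, a minimal sketch of Lloyd's algorithm in plain Python is shown below. To keep the result deterministic, it simply takes the first k points as the initial centroids, which is an assumption of this sketch and not how production implementations usually initialize. The function name `kmeans` and the toy points are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points serve as the initial centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignment


# Hypothetical toy data: two well-separated groups.
points = [(0.0, 0.0), (10.0, 10.0), (0.5, 0.2), (9.5, 10.4), (0.1, 0.6), (10.2, 9.8)]
centroids, assignment = kmeans(points, k=2)
```

Running the same function for several values of k and recording the within-cluster sum of squared distances is the starting point for the elbow method listed above.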

Text Mining

Tokenization
Cleaning and Normalization
Stemming and Lemmatization
Stopwords
Regular Expressions
Integer Encoding
Padding
One-hot Encoding
Data Split
Text Preprocessing Tools for Korean Text
Probabilistic language models / language model evaluation (Perplexity)
BoW (Bag of Words) / CountVectorizer
Document-Term Matrix
Sparse matrix (COO, CSR formats)
TF-IDF (Term Frequency-Inverse Document Frequency) / Practice: classifying 20 Newsgroups
Sentiment Analysis / SentiWordNet, VADER / Practice: predicting positive/negative sentiment of IMDB movie reviews / BeautifulSoup / using word clouds
Topic Modeling (LSA (SVD, Truncated SVD)), LSA
Document clustering
Vector similarity
Naver movie review sentiment analysis / Kaggle Mercari Price Suggestion Challenge
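
Several of the topics above (tokenization, bag of words, the document-term matrix, TF-IDF, and vector similarity) fit together in a few lines. The sketch below is a plain-Python illustration under stated assumptions: whitespace tokenization, one common smoothed IDF variant, and three made-up documents. The function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return the vocabulary and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Smoothed IDF (one common variant): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)  # raw term counts, i.e. the bag-of-words row
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus.
docs = ["the movie was great", "the movie was terrible", "taxi trip duration"]
vocab, vectors = tfidf_vectors(docs)

similar = cosine(vectors[0], vectors[1])    # share three of four words
unrelated = cosine(vectors[0], vectors[2])  # share no words
```

Most entries of each vector are zero even in this tiny corpus, which is why real document-term matrices are stored in sparse formats such as COO or CSR.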

Hackathon: New York City Taxi Trip Duration

Git/GitHub -> Sourcetree / VS Code

Data Analysis with R / Text Mining

Time Series Analysis (under revision, 22.09)

Statistics, hypothesis testing
Stochastic processes, time series data processing
Autocorrelation, Deterministic/Probabilistic models
t/F tests, Kullback-Leibler Divergence, AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion)
Python statsmodels package
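
Two of the quantities named above, the sample autocorrelation and the Kullback-Leibler divergence, are short enough to write out by hand. The sketch below uses plain Python rather than statsmodels; the function names and the toy inputs are assumptions made for illustration.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy series that alternates sign at every step.
series = [1.0, -1.0] * 10

lag1 = autocorrelation(series, 1)  # strongly negative: neighbors disagree
lag2 = autocorrelation(series, 2)  # strongly positive: the period is 2

divergence = kl_divergence([0.5, 0.5], [0.9, 0.1])
```

Note that the KL divergence is not symmetric: swapping the two arguments generally gives a different value.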

============================Deep Learning============================

Perceptron
Multilayer Perceptron
Convolutional Neural Networks
Recurrent Neural Networks
Speech Recognition -> the basics.
Convolutional Neural Networks, advanced (visual perception) -> R-CNN/Faster R-CNN
Recurrent Neural Networks, advanced (language perception) -> Transformer
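
As a starting point for the first topic in this list, the sketch below trains a single perceptron on the logical AND function using the classic perceptron update rule. The function names, the learning rate of 1.0, and the epoch count are assumptions chosen so the arithmetic stays exact; they are not taken from the lecture notes.

```python
def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron learning rule with a step activation (output 0 or 1)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = target - (1 if activation > 0 else 0)
            # Weights change only when the prediction is wrong (error != 0).
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Apply the trained perceptron to one input."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Truth table for logical AND, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, and_labels)
predictions = [predict(weights, bias, x) for x in samples]
```

A single perceptron can only separate classes with a straight line, so the same code cannot learn XOR; that limitation is the usual motivation for the multilayer perceptron that follows.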


Youngpyoryu/Lecture_Note

Recommender Systems

Linux programming

Docker

Web programming
