RL Study Group

Study Plan

Session	Date	Duration	Study Plan	Notes
Session 1 ✱	4/1	2 hours	Intro to RL and MDP Basics: core concepts of reinforcement learning and Markov decision processes (MDPs). Lecture 1; Sutton Ch 1 (Intro), Ch 3 (MDP)
Session 2 ✱	4/10	2 hours	Tabular MDP Planning & Policy Eval: planning and policy evaluation when the environment model is known. Lecture 2, Lecture 3	권정우
Session 3 ✱	5/1	2 hours	Tabular Control & Model-free RL: value estimation when the environment is unknown, and Q-Learning basics. Lecture 4; Sutton Ch 5 (Monte Carlo), Ch 6 (TD)	김상래
Session 4 ✱	5/8	2 hours	Function Approx & Deep Q-Networks: value function approximation for large state spaces, and DQN in depth. Lecture 5, Lecture 6	이용준
Session 5 ✱	5/15	2 hours	Policy Gradients 1: policy gradients beyond value-based methods, and approximation methods. Lecture 7; Sutton Ch 9 (On-policy Prediction), Ch 13 (PG)	장우혁
Session 6 ✱	5/22	2 hours	Policy Gradients 2 (Actor-Critic): Actor-Critic architecture and RL based on distributed processing. Lecture 8; Sutton Ch 10 (On-policy Control)	권정우
Session 7 ✱	5/29	2 hours	Paper Review Session 1 (optional): project proposal presentations	All members
Session 8 ✱	6/26	2 hours	Exploration (exploration strategies): principles of Multi-armed Bandits for efficient exploration. Lecture 10; Sutton Ch 2 (Multi-armed Bandits)	김상래
Session 9 ✱	7/3	2 hours	Offline RL & LLM (DPO): learning from offline datasets, and the DPO technique. Lecture 9, Lecture 11	이용준

Study Rules and Goals

Study Goals

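Using the CS234 lectures and the Sutton textbook, we aim to master reinforcement learning from its foundational theory through recent deep learning and offline RL techniques, and to let each member study reinforcement learning papers in their own area of interest.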

Participation Rules

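If you need to swap presenter duties or cannot avoid missing a session, let the group know in advance.
Rules on lateness and absences will be decided later.
Every member watches the Lecture assigned in the study plan before the meeting.
The presenter for each session prepares a presentation that covers the Lecture and the reference material (Sutton, 2ed).
After the presentation, those who wish to can take time to discuss the Assignments they have worked through.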

References / Materials

Main Material: CS234 (2024 Spring)
(Optional) Assignment: CS234 (2026 Winter)
Supplementary Material: Reinforcement Learning: An Introduction, 2nd ed., by Richard S. Sutton and Andrew G. Barto

