
mooooondh/BookRecommendation


Rating-Based Book Recommendation

This README is a summary.
The full write-up is available on the blog:
https://w-storage.tistory.com/33

1. Data collection

A Kaggle dataset was used: https://www.kaggle.com/bahramjannesarr/goodreads-book-datasets-10m

2. Types of recommendation systems

  1. Content-based filtering

Identifies the items a user prefers and recommends other items with similar characteristics.

  2. Nearest-neighbor collaborative filtering

Finds other users whose ratings are highly similar to the target user's and uses their ratings to decide whether to recommend an item.
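The nearest-neighbor idea described above can be sketched in a few lines. This is a minimal illustration and not the project's code: the user names, book titles, and the helper functions `cosine`, `nearest_users`, and `recommend` are all hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math

def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse rating vectors (item -> score)."""
    dot = sum(score * v[item] for item, score in u.items() if item in v)
    norm_u = math.sqrt(sum(s * s for s in u.values()))
    norm_v = math.sqrt(sum(s * s for s in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def nearest_users(target, ratings, k=2):
    """Return the k users most similar to `target`, most similar first."""
    others = [(cosine(ratings[target], r), uid)
              for uid, r in ratings.items() if uid != target]
    return [uid for _, uid in sorted(others, reverse=True)[:k]]

def recommend(target, ratings, k=1):
    """Suggest items the nearest neighbours rated but the target has not."""
    seen = ratings[target]
    scores = {}
    for uid in nearest_users(target, ratings, k):
        for item, score in ratings[uid].items():
            if item not in seen:
                scores[item] = max(scores.get(item, 0), score)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: user -> {book title: score}
ratings = {
    "alice": {"Book A": 5, "Book B": 4},
    "bob": {"Book A": 5, "Book B": 5, "Book C": 4},
    "carol": {"Book A": 1, "Book C": 5},
}
print(nearest_users("alice", ratings, k=1))
print(recommend("alice", ratings))
```

Here the neighbour's raw score is used to rank candidates; weighting by similarity is a common alternative.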

3. Data collection and preprocessing

This project uses a single file, user_rating_0_to_1000.csv.
It contains rating data for 1,000 users.

  1. ID
    There are 1,000 users, with IDs 0 to 999.
    [Figure: user frequency graph]

  2. Name
    The book title. Different books may share the same title.

  3. Rating
    The rating given by the user. It takes one of six values: 'This user doesn't have any rating', 'did not like it', 'it was ok', 'liked it', 'really liked it', 'it was amazing'.
    Because strings are awkward to work with, they were replaced with scores of 0, 1, 2, 3, 4, and 5, respectively.
    [Figure: rating composition]
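The label-to-score substitution described for the Rating column might look like the sketch below. The six labels and their scores come from the text above; the names `RATING_TO_SCORE` and `encode_rating` and the sample rows are hypothetical, and the project itself may have done this differently (for example with pandas).

```python
# Mapping from the six rating labels to numeric scores, as listed above.
RATING_TO_SCORE = {
    "This user doesn't have any rating": 0,
    "did not like it": 1,
    "it was ok": 2,
    "liked it": 3,
    "really liked it": 4,
    "it was amazing": 5,
}

def encode_rating(label: str) -> int:
    """Return the numeric score for a rating label; raise on unknown labels."""
    try:
        return RATING_TO_SCORE[label.strip()]
    except KeyError:
        raise ValueError(f"unexpected rating label: {label!r}") from None

# Hypothetical rows in (ID, Name, Rating) order.
rows = [
    (0, "Book A", "it was amazing"),
    (0, "Book B", "did not like it"),
    (1, "Book A", "This user doesn't have any rating"),
]
encoded = [(uid, name, encode_rating(rating)) for uid, name, rating in rows]
print(encoded)
```

Raising on an unknown label, rather than silently mapping it to 0, keeps a typo in the data from being mistaken for "no rating".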

4. Developing the book recommendation algorithm

  1. Measured the similarity between books using cosine similarity.
  2. Computed predicted ratings using the formula below.
     [Figure: predicted-rating formula]
  3. The resulting MSE was about 7.48.
  4. To improve the model, only highly similar items were used when comparing against a given item.
  5. The improved model's MSE was about 3.89.
  6. Book recommendation results for user 1.
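The steps above can be sketched end to end on a toy rating matrix. This is an illustration under stated assumptions, not the project's implementation: the prediction is assumed to be the common similarity-weighted average, pred(u, i) = Σ_j S[i][j]·R[u][j] / Σ_j |S[i][j]|, the sum is assumed to include item i itself, MSE is assumed to be taken only over entries that have a rating, and the matrix `R` and every function name are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length rating columns."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def item_similarity(R):
    """Item-item cosine similarity matrix from a users x items matrix."""
    cols = list(zip(*R))
    return [[cosine(ci, cj) for cj in cols] for ci in cols]

def predict(R, S, top_k=None):
    """Similarity-weighted average; with top_k, use only the k most similar items."""
    n_items = len(S)
    P = []
    for row in R:
        pred_row = []
        for i in range(n_items):
            neighbours = sorted(range(n_items), key=lambda j: S[i][j], reverse=True)
            if top_k is not None:
                neighbours = neighbours[:top_k]
            num = sum(S[i][j] * row[j] for j in neighbours)
            den = sum(abs(S[i][j]) for j in neighbours)
            pred_row.append(num / den if den else 0.0)
        P.append(pred_row)
    return P

def mse_on_rated(R, P):
    """Mean squared error over entries that actually have a rating (> 0)."""
    pairs = [(r, p) for rr, pr in zip(R, P) for r, p in zip(rr, pr) if r > 0]
    return sum((r - p) ** 2 for r, p in pairs) / len(pairs)

def recommend(R, P, user, n=1):
    """Indices of the user's unrated items, highest predicted rating first."""
    unrated = [i for i, r in enumerate(R[user]) if r == 0]
    return sorted(unrated, key=lambda i: P[user][i], reverse=True)[:n]

# Hypothetical 4 users x 4 books matrix; 0 means "no rating".
R = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
]
S = item_similarity(R)
P_all = predict(R, S)
P_top2 = predict(R, S, top_k=2)
mse_all = mse_on_rated(R, P_all)
mse_top2 = mse_on_rated(R, P_top2)
print(f"MSE (all items): {mse_all:.3f}, MSE (top-2 items): {mse_top2:.3f}")
print("Recommended item indices for user 1:", recommend(R, P_top2, user=1))
```

On this toy matrix, restricting the sum to the most similar items lowers the error, which mirrors the direction of the improvement reported above. The magnitudes are not comparable to the project's figures.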

5. Conclusion

Built a recommendation system based on the ratings users gave to books.
Because there is no way to know whether users actually read or rated the recommended books, MSE was used as the evaluation metric.
Accuracy could probably also be measured by removing part of the input data and checking whether the removed entries appear among the recommended books.
In the future, I would like to try projects that use matrix factorization and content-based recommendation.
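The hold-out check suggested in the conclusion could be sketched as a simple hit rate: hide one rated book per user, ask the recommender for its top-n list, and count how often the hidden book comes back. Everything here is hypothetical, including the names `hit_rate` and `most_popular`, the toy data, and the stand-in popularity recommender; any recommender with the same call signature could be plugged in.

```python
import random

def hit_rate(ratings, recommend, n=2, seed=0):
    """Fraction of users whose held-out item appears in their top-n list.

    ratings: user -> {item: score}
    recommend: callable (train, user, n) -> list of items
    """
    rng = random.Random(seed)
    hits = trials = 0
    for user, items in ratings.items():
        if len(items) < 2:
            continue  # keep at least one rating for the user in the training data
        held_out = rng.choice(sorted(items))
        train = {u: dict(r) for u, r in ratings.items()}
        del train[user][held_out]
        trials += 1
        hits += held_out in recommend(train, user, n)
    return hits / trials if trials else 0.0

def most_popular(train, user, n):
    """Stand-in recommender: the most-rated items the user has not rated."""
    counts = {}
    for r in train.values():
        for item in r:
            if item not in train[user]:
                counts[item] = counts.get(item, 0) + 1
    return sorted(counts, key=lambda i: (-counts[i], i))[:n]

# Hypothetical toy data: user ID -> {book title: score}
ratings = {
    0: {"Book A": 5, "Book B": 4, "Book C": 1},
    1: {"Book A": 4, "Book B": 5},
    2: {"Book B": 3, "Book C": 5, "Book D": 4},
}
print(hit_rate(ratings, most_popular, n=2))
```

Unlike MSE, this measures whether the ranked list contains books the user is known to have rated, so it rewards ranking quality, not score accuracy.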

6. Shortcomings and things to try next

  • Whenever a new user or book is added, the work in step 4 has to be redone. I would like to look for a more flexible way to handle users and books being added or removed.
  • I would like to compare the results against other recommendation approaches such as matrix factorization and content-based filtering.
  • I would like to build a neural-network-based recommender and compare the differences.
