![Imgur](https://imgur.com/Dmyouta.png)

# Building a Music Recommendation System

## STEP 1. Exploring the data
## STEP 2. Cleaning the data
## STEP 3. Machine learning

## Step 1: Exploring the data

In [1]:
# Import the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

import warnings
warnings.filterwarnings('ignore')

from typing import List, Dict

In [2]:
# Load the dataset
a = pd.read_csv('https://raw.githubusercontent.com/ugis22/music_recommender/master/content%20based%20recommedation%20system/songdata.csv')
mp3 = a.copy()
mp3.head()

Unnamed: 0,artist,song,link,text
0,ABBA,Ahe's My Kind Of Girl,/a/abba/ahes+my+kind+of+girl_20598417.html,"Look at her face, it's a wonderful face \nAnd..."
1,ABBA,"Andante, Andante",/a/abba/andante+andante_20002708.html,"Take it easy with me, please \nTouch me gentl..."
2,ABBA,As Good As New,/a/abba/as+good+as+new_20003033.html,I'll never know why I had to go \nWhy I had t...
3,ABBA,Bang,/a/abba/bang_20598415.html,Making somebody happy is a question of give an...
4,ABBA,Bang-A-Boomerang,/a/abba/bang+a+boomerang_20002668.html,Making somebody happy is a question of give an...


In [3]:
# Check the dimensions of the dataset
mp3.shape

(57650, 4)

In [4]:
# General information about the dataset
mp3.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 57650 entries, 0 to 57649
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   artist  57650 non-null  object
 1   song    57650 non-null  object
 2   link    57650 non-null  object
 3   text    57650 non-null  object
dtypes: object(4)
memory usage: 1.8+ MB


In [5]:
# Check whether the dataset contains any null values
mp3.isnull().sum()


artist    0
song      0
link      0
text      0
dtype: int64

In [6]:
# Take a random sample of 5000 rows from the DataFrame above,
# drop the 'link' column, and reset the index
mp3 = mp3.sample(n=5000)
mp3 = mp3.drop('link', axis=1).reset_index(drop=True)
mp3.head()

Unnamed: 0,artist,song,text
0,Widespread Panic,Travelin' Man,Been thinkin' all day \nPackin' my car \nWit...
1,Iron Butterfly,My Mirage,"In my mind I see a mirage on the wall, \nBut ..."
2,Rod Stewart,If Not For You,"If not for you, babe, I couldn't find the door..."
3,Eminem,Cocaine,"[Chorus:] \nGot to have it, \nYeah I made it..."
4,Dream Theater,Beyond This Life,"Headline: ""murder, young girl killed \nDesper..."
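
Because `sample` draws rows at random, every run of the notebook produces a different subset and therefore different recommendations. A minimal sketch of a repeatable version, assuming a fixed seed is acceptable; the `toy_songs` frame, the `sample_songs` helper, and the seed value 42 are illustrative and not part of the original notebook:

```python
import pandas as pd

# Hypothetical stand-in for the full lyrics DataFrame.
toy_songs = pd.DataFrame({
    "artist": [f"artist_{i}" for i in range(20)],
    "song": [f"song_{i}" for i in range(20)],
    "link": [f"/link/{i}" for i in range(20)],
})

def sample_songs(df, n, seed=42):
    """Sample n rows reproducibly, drop 'link', and renumber the index from 0."""
    sampled = df.sample(n=n, random_state=seed)
    return sampled.drop('link', axis=1).reset_index(drop=True)

first = sample_songs(toy_songs, 5)
second = sample_songs(toy_songs, 5)
print(first.equals(second))  # True: the same seed gives the same sample
```

Passing `random_state` is the only change relative to the cell above; without it the sample differs on every run.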


In [7]:
# Replace the newline characters ('\n') in the 'text' column of mp3 with spaces.
# regex=False treats the pattern as a literal string, so a real newline is matched.
mp3['text'] = mp3['text'].str.replace('\n', ' ', regex=False)
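
What `Series.str.replace` matches depends on whether the pattern is read as a regular expression or as a literal string, and the default for `regex` changed from `True` to `False` in pandas 2.0. The sketch below uses a hypothetical two-row `toy_text` series to show the difference between a real newline character and the two-character raw string `r'\n'` when both are treated literally:

```python
import pandas as pd

# Hypothetical lyrics; the first row contains a real newline character.
toy_text = pd.Series(["first line \nsecond line", "no line break here"])

# '\n' is a single newline character, matched literally and replaced by a space.
cleaned = toy_text.str.replace('\n', ' ', regex=False)

# r'\n' is two characters (a backslash and the letter n). As a literal
# pattern it does not occur in the text, so nothing is replaced.
unchanged = toy_text.str.replace(r'\n', ' ', regex=False)

print('\n' in cleaned[0])    # False
print('\n' in unchanged[0])  # True
```

Passing `regex` explicitly keeps the behaviour the same across pandas versions.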

In [8]:
# analyzer='word': build the features from whole words
# stop_words='english': remove common English stop words such as "to", "and", "the", "is"
tfidf = TfidfVectorizer(analyzer='word', stop_words='english')

In [9]:
lyrics_matrix = tfidf.fit_transform(mp3['text'])
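
To see what `fit_transform` returns, it can help to run the same vectorizer settings on a tiny corpus. This is a sketch only: the three "lyrics" in `toy_lyrics` are made up for illustration, and the names `toy_tfidf` and `toy_matrix` do not appear in the notebook.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical three-document corpus standing in for mp3['text'].
toy_lyrics = [
    "money money money",
    "love and money",
    "love is all you need",
]

toy_tfidf = TfidfVectorizer(analyzer='word', stop_words='english')
toy_matrix = toy_tfidf.fit_transform(toy_lyrics)  # sparse matrix: documents x terms

# The vocabulary holds only the terms that survived stop-word removal.
print(sorted(toy_tfidf.vocabulary_))
print(toy_matrix.shape)

# With the default norm='l2', every non-empty row has unit length.
row_norms = np.linalg.norm(toy_matrix.toarray(), axis=1)
print(row_norms.round(3))
```

Each row is one document and each column one term, so `lyrics_matrix` in the notebook has 5000 rows, one per sampled song.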

In [10]:
# Compute the cosine similarity between every pair of rows (songs) in the matrix
cosine_similarities = cosine_similarity(lyrics_matrix)
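
Cosine similarity between two vectors u and v is their dot product divided by the product of their lengths, (u · v) / (‖u‖ ‖v‖). The sketch below checks `cosine_similarity` against that formula on a hypothetical 3 × 2 matrix; `toy_vectors` and `cosine_by_hand` are illustrative names, not part of the notebook.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical weights: rows are songs, columns are terms.
toy_vectors = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
])

def cosine_by_hand(u, v):
    """Dot product divided by the product of the Euclidean norms."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

toy_similarities = cosine_similarity(toy_vectors)  # 3 x 3, symmetric

print(round(float(toy_similarities[0, 1]), 3))                    # 0.707
print(round(cosine_by_hand(toy_vectors[0], toy_vectors[1]), 3))   # 0.707
print(round(float(toy_similarities[0, 2]), 3))                    # 0.0 (no shared terms)
```

The result is a square matrix whose entry (i, j) compares song i with song j, and whose diagonal is 1 because every song is identical to itself.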

In [11]:
similarities = {}
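
Since this dictionary is keyed by song title, two songs that share a title would overwrite each other's entry. A quick way to check for that, shown on a hypothetical `toy_songs` frame; keying by an `(artist, song)` tuple is one possible workaround and is an assumption, not what the notebook does:

```python
import pandas as pd

# Hypothetical sample in which two different artists have a song called "Money".
toy_songs = pd.DataFrame({
    "artist": ["Artist 1", "Artist 2", "Artist 3"],
    "song": ["Money", "Money", "Bang"],
})

# Number of rows whose title already appeared earlier in the column.
duplicate_titles = int(toy_songs['song'].duplicated().sum())
print(duplicate_titles)  # 1

# (artist, song) pairs are unique here, so they would be safe dictionary keys.
keys = list(zip(toy_songs['artist'], toy_songs['song']))
print(len(set(keys)) == len(keys))  # True
```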

In [12]:
for i in range(len(cosine_similarities)):
    # Sort the songs by their cosine similarity to song i and keep the indices
    # of the 51 highest scores (the song itself plus its 50 nearest neighbours).
    similar_indices = cosine_similarities[i].argsort()[:-52:-1]
    # Store (similarity, title, artist) for the 50 most similar songs, skipping
    # the first entry, which is the song itself.
    similarities[mp3['song'].iloc[i]] = [(cosine_similarities[i][x], mp3['song'][x], mp3['artist'][x]) for x in similar_indices][1:]
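
The reverse slice on the sorted indices is easy to misread: a slice of the form `[:-n:-1]` returns the last n − 1 entries in reverse order, not n. A small sketch with hypothetical scores (`scores` and `k` are illustrative names) makes the counting explicit:

```python
import numpy as np

# Hypothetical similarities of one song to five songs; index 2 is the song itself.
scores = np.array([0.10, 0.40, 1.00, 0.25, 0.70])
k = 3  # neighbours wanted, not counting the song itself

# argsort() is ascending, so stepping backwards from the end visits the
# highest scores first; stopping at -(k + 2) keeps k + 1 indices.
top_with_self = scores.argsort()[:-(k + 2):-1]
top_k = top_with_self[1:]  # drop the first entry, assumed to be the song itself

print(top_with_self.tolist())  # [2, 4, 1, 3]
print(top_k.tolist())          # [4, 1, 3]
```

Dropping the first entry relies on the song being its own best match, which can fail when another song has identical lyrics.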


In [29]:
class ContentBasedRecommender:
    def __init__(self, matrix):
        # Dictionary mapping a song title to its list of (similarity, title, artist) tuples
        self.matrix_similar = matrix

    def _print_message(self, song, recom_song):
        rec_items = len(recom_song)

        print(f'The {rec_items} recommended songs for {song} are:')
        for i in range(rec_items):
            print(f"Song {i+1}:")
            print(f"{recom_song[i][1]} by {recom_song[i][2]} with a similarity score of {round(recom_song[i][0], 3)}")
            print("--------------------")

    def recommend(self, recommendation):
        # Title of the song to find recommendations for
        song = recommendation['song']
        # Number of songs to recommend
        number_songs = recommendation['number_songs']
        # Take the most similar songs from the precomputed dictionary
        recom_song = self.matrix_similar[song][:number_songs]
        self._print_message(song=song, recom_song=recom_song)


In [30]:
recommedations = ContentBasedRecommender(similarities)

In [31]:
recommendation = {
    "song": mp3['song'].iloc[10],
    "number_songs": 4
}

In [32]:
recommedations.recommend(recommendation)

The 4 recommended songs for Money Makes Her Smile are:
Song 1:
Money Money Money Shouts by Fabolous with a similarity score of 0.78
--------------------
Song 2:
Money by Extreme with a similarity score of 0.707
--------------------
Song 3:
Money Makes The World Go Round by R. Kelly with a similarity score of 0.579
--------------------
Song 4:
Bet Money by Gucci Mane with a similarity score of 0.433
--------------------
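
The `recommend` method prints its result and raises a `KeyError` when the requested title is not in the dictionary. If the recommendations are needed as data, a function along the following lines might be used instead. It is a sketch under assumptions: `top_recommendations`, `Neighbour`, and the toy lookup table are hypothetical names, and returning an empty list for an unknown title is one possible design choice.

```python
from typing import Dict, List, Tuple

Neighbour = Tuple[float, str, str]  # (similarity, song title, artist)

def top_recommendations(similar: Dict[str, List[Neighbour]],
                        song: str,
                        number_songs: int) -> List[Neighbour]:
    """Return up to `number_songs` neighbours, or an empty list for an unknown title."""
    return similar.get(song, [])[:number_songs]

# Hypothetical lookup table with the same shape as `similarities`.
toy_lookup = {
    "Song A": [(0.8, "Song B", "Artist 1"), (0.5, "Song C", "Artist 2")],
}

print(top_recommendations(toy_lookup, "Song A", 1))        # [(0.8, 'Song B', 'Artist 1')]
print(top_recommendations(toy_lookup, "Missing Song", 3))  # []
```

Returning the tuples keeps the formatting decision with the caller, so the same result can be printed, logged, or passed on to another step.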
