# Recommendation Using Deep Learning

*This model is designed to generate meaningful numerical representations (embeddings) of online courses by combining both categorical and numerical data. In real-world educational platforms, courses are described using various types of features: some are categorical, such as course name, instructor, and difficulty level, while others are numerical, such as duration, ratings, price, and enrollment statistics. To effectively utilize this mixed data in a neural network, categorical variables are first encoded and passed through embedding layers to learn dense vector representations. Numerical features are normalized to ensure uniform scale and improve training efficiency. By integrating these processed inputs and passing them through a series of dense layers, the model learns a compact, 32-dimensional embedding that captures the core attributes of each course. This embedding can be used for tasks like personalized recommendations, similarity detection, or clustering.*

In [26]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, Dense, Concatenate, Flatten, Dropout
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

In [27]:
df = pd.read_csv('Datasets/online_courses_updated.csv')
df = df.drop(columns=['Unnamed: 0'])
df.head()

Unnamed: 0,user_id,course_id,course_name,instructor,course_duration_hours,certification_offered,difficulty_level,rating,enrollment_numbers,course_price,feedback_score,study_material_available,time_spent_hours,previous_courses_taken,course_images,instructor_images
0,15796,9366,Python for Beginners,Emma Harris,39.1,Yes,Beginner,5.0,21600,317.5,0.797,Yes,17.6,4,https://images.unsplash.com/photo-152637909509...,https://images.pexels.com/photos/712521/pexels...
1,861,1928,Cybersecurity for Professionals,Alexander Young,36.3,Yes,Beginner,4.3,15379,40.99,0.77,Yes,28.97,9,https://images.pexels.com/photos/577585/pexels...,https://images.unsplash.com/photo-150064876779...
2,38159,9541,DevOps and Continuous Deployment,Dr. Mia Walker,13.4,Yes,Beginner,3.9,6431,380.81,0.772,Yes,52.44,4,https://images.pexels.com/photos/270404/pexels...,https://images.pexels.com/photos/733872/pexels...
3,44733,3708,Project Management Fundamentals,Benjamin Lewis,58.3,Yes,Beginner,3.1,48245,342.8,0.969,No,22.29,6,https://images.unsplash.com/photo-157316471371...,https://images.unsplash.com/photo-151908536075...
4,11285,3361,Ethical Hacking Masterclass,Daniel White,30.8,Yes,Beginner,2.8,34556,381.01,0.555,Yes,22.01,5,https://images.unsplash.com/photo-156398676860...,https://images.pexels.com/photos/2379004/pexel...


In [28]:
max_enrollments, min_enrollments = df['enrollment_numbers'].max(), df['enrollment_numbers'].min()
threshold_score = max_enrollments * 0.80
threshold_score

39999.200000000004

In [29]:
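## Keep only high-enrollment courses; reset the index so row labels match row positions
df = df[df['enrollment_numbers'] > threshold_score].reset_index(drop=True)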
df.shape

(20041, 16)

In [30]:
original_df = df.copy()

In [31]:
course_le = LabelEncoder()           ## Course Encoder
instructor_le = LabelEncoder()       ## Instructor Encoder
difficulty_le = LabelEncoder()       ## Difficulty Encoder

In [32]:
## Fit each encoder on its column and store the integer codes
df['course_name_enc'] = course_le.fit_transform(df['course_name'])
df['instructor_enc'] = instructor_le.fit_transform(df['instructor'])
df['difficulty_enc'] = difficulty_le.fit_transform(df['difficulty_level'])
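
The encoding step above can be illustrated without scikit-learn. `LabelEncoder` maps each distinct value to an integer code, with the classes kept in sorted order. The following is a minimal pure-Python sketch of that behavior, not the library's implementation; the helper name `fit_label_codes` and the sample values are hypothetical.

```python
def fit_label_codes(values):
    """Return (classes, mapping); classes are the sorted unique values."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return classes, mapping

# Hypothetical difficulty values, for illustration only.
levels = ["Beginner", "Advanced", "Intermediate", "Beginner"]
classes, mapping = fit_label_codes(levels)

codes = [mapping[v] for v in levels]      # forward: label -> integer code
decoded = [classes[c] for c in codes]     # inverse: integer code -> label
print(classes, codes)
```

The codes carry no meaningful order (here "Advanced" sorts before "Beginner"), which is one reason they are fed to embedding layers rather than used directly as numeric inputs.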

In [33]:
## Normalize numeric features
scaler = MinMaxScaler()
num_cols = ['course_duration_hours', 'rating', 'feedback_score',
            'course_price', 'enrollment_numbers', 'time_spent_hours',
            'previous_courses_taken']

In [34]:
## Scale the numeric columns to the [0, 1] range in place
df[num_cols] = scaler.fit_transform(df[num_cols])
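
For reference, min-max scaling maps each value `x` in a column to `(x - min) / (max - min)`. The sketch below applies that formula in plain Python to the five ratings visible in the data preview earlier in the notebook. The helper `min_max_scale` is hypothetical, and its handling of a constant column is an assumption made for this example only.

```python
def min_max_scale(values):
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Ratings from the five preview rows shown earlier.
ratings = [5.0, 4.3, 3.9, 3.1, 2.8]
scaled = min_max_scale(ratings)
print([round(s, 3) for s in scaled])
```

The largest rating maps to 1.0 and the smallest to 0.0, so every column ends up on the same scale regardless of its original units.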

In [35]:
X_cat = df[['course_name_enc', 'instructor_enc', 'difficulty_enc']]     ## Categorical Features
X_num = df[num_cols]                                                    ## Numerical Features 

In [36]:
## Inputs
input_course = Input(shape=(1,))
input_instructor = Input(shape=(1,))
input_difficulty = Input(shape=(1,))
input_numeric = Input(shape=(X_num.shape[1],))

In [37]:
## Embeddings
emb_course = Embedding(input_dim=df['course_name_enc'].nunique()+1, output_dim=8, name='emb_course')(input_course)
emb_instr = Embedding(input_dim=df['instructor_enc'].nunique()+1, output_dim=8, name='emb_instr')(input_instructor)
emb_diff = Embedding(input_dim=df['difficulty_enc'].nunique()+1, output_dim=4, name='emb_diff')(input_difficulty)

In [38]:
## Flatten embeddings
flat_course = Flatten()(emb_course)
flat_instr = Flatten()(emb_instr)
flat_diff = Flatten()(emb_diff)
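
Conceptually, an embedding layer is a lookup table with one row per integer id, and `Flatten` removes the extra length-1 axis that comes from feeding one id per sample. The following is a rough pure-Python sketch of the shapes involved, assuming a randomly initialized table; the function names, table size, and initialization range are hypothetical and are not Keras internals.

```python
import random

def make_embedding_table(num_ids, dim, seed=0):
    """Randomly initialized lookup table: one row of `dim` floats per id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(num_ids)]

def embed(table, batch):
    """batch has shape (batch, 1); the result has shape (batch, 1, dim)."""
    return [[table[i] for i in row] for row in batch]

def flatten(embedded):
    """Collapse (batch, 1, dim) to (batch, dim) by joining the inner rows."""
    return [[x for vec in row for x in vec] for row in embedded]

table = make_embedding_table(num_ids=4, dim=8)   # hypothetical: 4 ids, 8 dimensions
batch = [[2], [0], [2]]                          # three samples, one id each
embedded = embed(table, batch)
flat = flatten(embedded)

print(len(embedded), len(embedded[0]), len(embedded[0][0]))  # prints: 3 1 8
print(len(flat), len(flat[0]))                               # prints: 3 8
```

Samples that share an id (rows 0 and 2 here) receive the same vector, which is what lets the network share what it learns about a course, instructor, or difficulty level across rows.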

In [39]:
x = Concatenate()([flat_course, flat_instr, flat_diff, input_numeric])
x = Dense(128, activation='relu')(x)
x = Dropout(0.3)(x)
x = Dense(64, activation='relu')(x)

In [40]:
embedding_output = Dense(32, activation='relu', name='final_embedding')(x)
model = Model(inputs=[input_course, input_instructor, input_difficulty, input_numeric], outputs=embedding_output)
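
As a sanity check on the architecture, the layer widths can be worked out by hand. The sketch below uses only the sizes stated in the cells above (embedding widths 8, 8, and 4, plus the 7 numeric columns) and counts weights and biases for each dense layer. Embedding-table sizes are left out because they depend on the number of distinct values in the data.

```python
# Widths taken from the cells above: three flattened embeddings plus 7 numeric columns.
embedding_dims = {"course": 8, "instructor": 8, "difficulty": 4}
num_numeric = 7
concat_width = sum(embedding_dims.values()) + num_numeric

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Dropout adds no parameters, so only the Dense layers are counted.
layer_widths = [concat_width, 128, 64, 32]
params = [dense_params(a, b) for a, b in zip(layer_widths, layer_widths[1:])]
print(concat_width, params)  # prints: 27 [3584, 8256, 2080]
```

These figures can be compared against the Dense rows of `model.summary()`.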

In [41]:
# Get embeddings for all courses
# (no compile/fit call appears above, so these come from the randomly initialized weights)
course_embeddings = model.predict([
    df['course_name_enc'],
    df['instructor_enc'],
    df['difficulty_enc'],
    df[num_cols]
], verbose=0)

In [42]:
course_embeddings.shape

(20041, 32)

In [43]:
from sklearn.metrics.pairwise import cosine_similarity
def recommend_dl(course_name, instructor_name, top_n=6):
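    ## Look up the row *position* of the requested course/instructor pair,
    ## so it can index course_embeddings (a NumPy array) safely
    matches = np.flatnonzero(
        (original_df['course_name'] == course_name).to_numpy() &
        (original_df['instructor'] == instructor_name).to_numpy()
    )
    if len(matches) == 0:
        return "Course not found."
    course_idx = matches[0]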

    input_vec = course_embeddings[course_idx].reshape(1, -1)
    sims = cosine_similarity(input_vec, course_embeddings).flatten()

    similar_idxs = np.argsort(sims)[::-1]
    similar_idxs = [i for i in similar_idxs if i != course_idx][:top_n]

    final_df = original_df.iloc[similar_idxs][['course_name', 'instructor', 'rating', 'course_images', 'instructor_images']].reset_index(drop=True)
    return final_df
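
The ranking step inside `recommend_dl` can be traced on a toy example. Below is a minimal pure-Python sketch of cosine-similarity ranking that excludes the query row; the helpers `cosine` and `top_n_similar` and the 2-dimensional vectors are hypothetical stand-ins for the 32-dimensional embeddings, not the scikit-learn implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_n_similar(vectors, query_idx, top_n):
    """Positions of the top_n rows most similar to the query row, excluding itself."""
    sims = [(cosine(vectors[query_idx], v), i)
            for i, v in enumerate(vectors) if i != query_idx]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in sims[:top_n]]

vectors = [
    [1.0, 0.0],   # row 0: the query
    [0.9, 0.1],   # row 1: nearly parallel to the query
    [0.0, 1.0],   # row 2: orthogonal to the query
    [2.0, 0.0],   # row 3: same direction, different length
]
result = top_n_similar(vectors, query_idx=0, top_n=2)
print(result)  # prints: [3, 1]
```

Row 3 ranks first even though it is twice as long as the query, because cosine similarity compares direction only. The returned values are row positions, which is why the function above pairs them with `iloc`.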

In [44]:
recommends = recommend_dl("Advanced Machine Learning", "Liam Adams", top_n=6)
recommends

Unnamed: 0,course_name,instructor,rating,course_images,instructor_images
0,Mobile App Development with Swift,Dr. Mia Walker,4.7,https://images.unsplash.com/photo-163335612254...,https://images.pexels.com/photos/733872/pexels...
1,Public Speaking Mastery,Olivia Taylor,4.9,https://images.unsplash.com/photo-143154001516...,https://images.pexels.com/photos/1326946/pexel...
2,Graphic Design with Canva,Michael Brown,5.0,https://images.unsplash.com/photo-1547658719-d...,https://images.unsplash.com/photo-1560250097-0...
3,Photography and Video Editing,Sarah Lee,4.2,https://images.unsplash.com/photo-151603506937...,https://images.unsplash.com/photo-152450438894...
4,Python for Beginners,Benjamin Lewis,4.3,https://images.unsplash.com/photo-152637909509...,https://images.unsplash.com/photo-151908536075...
5,Python for Beginners,Charlotte King,4.6,https://images.unsplash.com/photo-152637909509...,https://images.pexels.com/photos/774909/pexels...
