SMART COURSE SELECTOR

🚀 About

Smart Course Selector is a course recommendation tool that helps students in the MSc Computer Engineering program (UDE) find courses that match their preferences, interests, and academic goals. It analyzes factors such as preferred semester, prior programming experience, and language proficiency to produce personalized course recommendations.

Key Features

Personalized Course Suggestions: Tailored recommendations based on your unique learning preferences.
Interest & Concept-Based Selection: Find courses that align with topics and subjects you enjoy.
Data-Driven Insights: Leverages intelligent algorithms to provide meaningful and accurate recommendations.

⚙️ Architecture

📚 Libraries and Algorithms

Core Libraries

Flask: Web framework for serving the application
PyMongo: MongoDB integration for Python
Plotly: Interactive data visualization
Dash: Framework for building analytical web applications
Flask-WTF: Form handling and validation
scikit-learn: Machine learning utilities
NumPy: Numerical computing
python-dotenv: Environment variable management
pickle: Serialization and deserialization of Python objects, used for storing trained models

Recommendation System Architecture

This system implements a content-based recommendation algorithm that suggests courses to students based on their profile data, preferences, and course descriptions.

Core Algorithm: Content-Based Filtering with Word2Vec

The recommendation engine employs Word2Vec models to capture semantic relationships between course descriptions and student preferences. This approach goes beyond simple keyword matching by:

Semantic Understanding: Word2Vec transforms text into vector representations that capture the semantic meaning of words and concepts
Contextual Similarity: Measures how closely course content aligns with student interests and background
Feature Extraction: Converts unstructured text data into meaningful numerical representations
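
As a rough illustration of how word vectors can become a single vector for a whole course description, the sketch below averages the vectors of the known words. The three-dimensional `TOY_VECTORS` table and the `embed_text` helper are invented for this example and stand in for a trained Word2Vec vocabulary; the project's actual embedding code may differ.

```python
# Toy 3-dimensional embeddings standing in for a trained Word2Vec vocabulary.
# These values are invented for illustration only.
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "learning": [0.8, 0.2, 0.1],
    "networks": [0.1, 0.9, 0.2],
}


def embed_text(tokens, vectors):
    """Average the vectors of all known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # nothing in the text is in the vocabulary
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# "basics" is not in the toy vocabulary, so only two vectors are averaged.
course_vector = embed_text(["machine", "learning", "basics"], TOY_VECTORS)
print(course_vector)
```

Averaging is only one common way to pool word vectors; it ignores word order, which is usually acceptable for short course descriptions.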

Similarity Metrics and Distance Calculations

Cosine Similarity

Our primary similarity metric for content matching is cosine similarity, which measures the cosine of the angle between two non-zero vectors:

cosine_similarity(A, B) = (A · B) / (||A|| * ||B||)

This algorithm was chosen because:

It effectively captures semantic similarity regardless of magnitude
It works well with the high-dimensional vectors typical in text analysis, including dense Word2Vec embeddings
It produces normalized results between -1 and 1, where 1 represents identical direction
It emphasizes the orientation rather than magnitude of text feature vectors
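
The formula above can be written in a few lines of plain Python. This is a minimal sketch using only the standard library; the function name mirrors the formula, and the example vectors are made up to show the three boundary cases (same direction, orthogonal, opposite).

```python
import math


def cosine_similarity(a, b):
    """Return (a . b) / (||a|| * ||b||) for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1, 0], [2, 0])   # magnitude differs, direction does not
orthogonal = cosine_similarity([1, 0], [0, 1])       # no shared direction
opposite = cosine_similarity([1, 0], [-1, 0])        # exactly opposed
print(same_direction, orthogonal, opposite)
```

The first pair differs only in length, which is why the score is unaffected: the metric compares direction, not magnitude.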

Euclidean Distance

For certain matching criteria (like math level and time availability), we employ Euclidean distance to measure the direct distance between feature points:

euclidean_distance(A, B) = √(∑ᵢ (Aᵢ - Bᵢ)²)

This metric is particularly useful for:

Calculating differences in numeric attributes
Measuring absolute distances in multi-dimensional feature space
Providing intuitive distance measurements for non-textual features
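
A small sketch of the distance formula, again with the standard library only. The `distance_to_score` helper is an assumption added for illustration: it uses 1 / (1 + d) to turn a distance into a score where closer means higher, which is one common choice and not necessarily the mapping used in this project.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_to_score(distance):
    """Hypothetical mapping: distance 0 gives score 1, larger distances approach 0."""
    return 1.0 / (1.0 + distance)


# Made-up feature pairs, e.g. (math level, weekly hours) for a student and a course.
student = [0, 0]
course = [3, 4]
d = euclidean_distance(student, course)
print(d, distance_to_score(d))
```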

Vector Processing Pipeline

Text Preprocessing: Tokenization, stopword removal, and lemmatization
Vector Embedding: Conversion of processed text to numerical vectors using Word2Vec
Feature Weighting: Applying TF-IDF weighting to emphasize important terms
Dimension Reduction: Optional PCA for feature space optimization
Similarity Computation: Calculating similarity scores using the metrics above
Score Normalization: Scaling scores to a consistent range for comparison
Weighted Aggregation: Combining individual criterion scores into a final recommendation score
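
Two of the steps above, text preprocessing and score normalization, are easy to sketch without external libraries. The stopword list, the `preprocess` and `min_max_normalize` names, and the choice of min-max scaling are all assumptions for this example; lemmatization is left out because it needs a language resource that the standard library does not provide.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def min_max_normalize(scores):
    """Scale scores linearly into [0, 1]; a constant list maps to all zeros."""
    low, high = min(scores), max(scores)
    if high == low:
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


tokens = preprocess("Introduction to the Theory of Neural Networks")
normalized = min_max_normalize([2.0, 4.0, 6.0])
print(tokens)
print(normalized)
```

Normalizing every criterion to the same range matters because cosine similarities and raw distances live on different scales and could not otherwise be combined fairly.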

Matching Criteria

The system calculates a composite matching score based on multiple factors:

Content Similarity: Semantic matching between student interests and course content (cosine similarity of Word2Vec vectors)
Math Level Compatibility: Alignment between course mathematical requirements and student background (Euclidean distance + Threshold-Based Scoring)
Time Availability Match: Comparison of course workload to student's available time (Normalized Euclidean distance)
Language Proficiency: Match between course language and student's language abilities (Proficiency-Weighted Cosine Similarity)
Programming Requirements: Alignment of student's programming skills with course needs (Multi-dimensional Comparison)
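
One way to combine the criteria above into a composite score is a weighted average of per-criterion scores that have already been scaled to [0, 1]. The weights below are hypothetical and chosen only to make the example concrete; the source does not state the actual weighting.

```python
# Hypothetical weights per criterion; the real values are not documented here.
WEIGHTS = {
    "content": 0.40,
    "math": 0.20,
    "time": 0.15,
    "language": 0.15,
    "programming": 0.10,
}


def composite_score(scores, weights):
    """Weighted average over the criteria present in `scores` (each in [0, 1])."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one criterion must have a non-zero weight")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


# Made-up criterion scores for one student-course pair.
example_scores = {
    "content": 0.8,
    "math": 0.5,
    "time": 1.0,
    "language": 1.0,
    "programming": 0.6,
}
final = composite_score(example_scores, WEIGHTS)
print(round(final, 3))
```

Dividing by the total weight keeps the result in [0, 1] even when a criterion is missing for a given course.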

Model Persistence

The trained Word2Vec model is serialized using Python's pickle module for efficient storage and retrieval, allowing for:

Fast loading of pre-trained models without retraining
Consistent semantic relationship calculations across system restarts
Reduced computational overhead during recommendation generation
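
The save-and-load cycle can be sketched with the standard `pickle` module. The dictionary below is a placeholder for whatever objects the project actually stores, and the helper names and temporary file path are assumptions for this example.

```python
import os
import pickle
import tempfile


def save_resources(resources, path):
    """Serialize an object to disk in binary mode."""
    with open(path, "wb") as fh:
        pickle.dump(resources, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_resources(path):
    """Load a previously pickled object. Only unpickle files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Placeholder data standing in for a trained model and precomputed vectors.
resources = {"vectors": {"machine": [0.9, 0.1, 0.0]}, "version": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, "example_resources.pkl")
    save_resources(resources, pkl_path)
    restored = load_resources(pkl_path)

print(restored == resources)
```

Because unpickling can execute arbitrary code, a `.pkl` file should only be loaded when it comes from a trusted source such as the project repository itself.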

📸 A Look at the App

Application Flow

Closer look at visualizations

📝 Running the app

To run the app locally, follow these steps:

# Open a terminal (Command Prompt or PowerShell for Windows, Terminal for macOS or Linux)

# Ensure Git is installed
# Visit https://git-scm.com to download and install Git if it is not already installed

# Clone the repository
git clone https://github.com/dengwanlin/Course-selection-advisor.git

# Navigate to the project directory
cd Course-selection-advisor

# Install required libraries
pip install -r requirements.txt

# Contact any of the authors for the `.env` file needed to access the database
# Place the `.env` file in the root of the Course-selection-advisor folder

# Run app
python app.py

🤝 Demo Video

[]

Click the image above to watch a demonstration of the Course Recommendation System in action.

🤝 Feedback and Contributions

Important

Whether you have feedback on features, have encountered bugs, or have suggestions for enhancements, we're eager to hear from you. Your insights help us make the Smart Course Selector better for students.

We appreciate your support and look forward to making our product even better with your help!

👥 Authors

Clement Ankomah - @kojobaffoe011
Shafika Islam - @shafika005
Haihua Wang - @dengwanlin
Hazem Al Massalmeh - @Hazemmasa
Marta Zhao Ladrón de Guevara Cano
Laura María García Pulido

📃 License

Distributed under the MIT License. See LICENSE.txt for more information.
