# Top 5 Flicks - Movie Recommendation System
- Author: Jesse Moore
- Phase 4
- Instructor - Mark Barbour
- Blog Post: 

![image](images/picking_movie.jpeg)

# Summary

Our project develops a recommendation system for a new streaming service that gives each user five personalized movie recommendations. The system combines a K-Nearest Neighbors (KNN) model with an Alternating Least Squares (ALS) model.

Business and Data Understanding:
We utilized the MovieLens dataset, which contains 100,000 ratings and 3,600 tags, making it well-suited for training a recommendation system. This dataset provides rich user-movie interactions and metadata, essential for building effective recommendations.

Data Preparation:
For our modeling, the data required little to no cleaning or preparation aside from merging our ratings data with our movie data.

Libraries: 
Aside from the standard libraries (numpy, pandas, random, matplotlib.pyplot, seaborn, warnings), we used the following libraries: zipfile for extracting our data; scipy for sparse matrices (csr_matrix); scikit-learn for model selection, label encoding, and cosine similarity; pyspark for Alternating Least Squares (ALS) modeling; and Surprise for our K-Nearest Neighbors modeling.

Modeling:
We initially implemented an Alternating Least Squares (ALS) model using Spark for collaborative filtering. ALS decomposes the user-item matrix to uncover latent factors, and we tuned its hyperparameters (rank, maxIter, and regParam). We then introduced a K-Nearest Neighbors (KNN) baseline model using the Surprise package, tuning for the best k and min_k values. Combining the two models is intended to mitigate 'cold-start' problems, where users have little to no rating history or movies have little to no ratings.

Evaluation:
Performance was assessed using Root Mean Squared Error (RMSE), where a lower RMSE indicates better predictive accuracy. Grid search tuning found the best model to be KNN Baseline (k=56, min_k=14) with an RMSE of 0.886, slightly outperforming the ALS baseline (RMSE 0.888). ALS remains valuable for handling cold start issues.

Insights and Limitations:
Additional user behavior tracking—such as viewing duration and repeated watches—could improve recommendations. Incorporating explicit user preferences, such as favorite genres and actors, would further refine personalization. Future iterations may explore hybrid deep learning models for enhanced performance.

# Overview

## Business Understanding

### Problem Statement

Our client is launching a new streaming service and wants to implement a recommendation system that provides users with five personalized movie recommendations.

To achieve this, we will use a hybrid approach, combining collaborative filtering and content-based filtering. This will help mitigate the cold start problem, which occurs when:

- New users, who haven’t rated any content, receive poor or no recommendations.
- Movies with few or no ratings are unlikely to be recommended.

Additionally, we have been tasked with delivering insights into user engagement, enabling our client to maximize engagement with their growing user base.

Evaluation Metrics
To assess the effectiveness of our recommendation system, we will use the following key metrics:

- Cosine Similarity – Measures the similarity between movies based on content features. It is the cosine of the angle between two feature vectors, so values closer to 1 indicate movies with similar characteristics.
- Root Mean Squared Error (RMSE) – Evaluates the accuracy of our rating predictions as the square root of the mean squared difference between actual and predicted ratings. A lower RMSE indicates better prediction performance.

By leveraging this hybrid approach and these evaluation metrics, we aim to build a recommendation system that delivers high-quality suggestions while providing valuable insights for our client.

These concepts will be explained in further detail below.
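
As a minimal sketch of the two metrics described above, the snippet below computes RMSE and cosine similarity with NumPy. The helper names `rmse` and `cosine_sim` and the toy inputs are assumptions made for illustration; they are not part of the project code.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def cosine_sim(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / norm) if norm else 0.0

# Hypothetical ratings and one-hot genre vectors, for illustration only
print(rmse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))
print(cosine_sim([1, 1, 0], [1, 0, 0]))
```

A perfect prediction gives an RMSE of 0, and two movies with identical genre vectors give a cosine similarity of 1.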

### Business Objective


Our business objective is to enhance user enjoyment and engagement with our client's streaming service.

To achieve this, we will recommend five movies to each user that they are likely to enjoy. Success will be measured by ensuring that our recommendations receive higher rating scores compared to a baseline model.

### Stakeholder Questions

User Behavior & Engagement

- What types of movies (genres, ratings, release years) are most frequently watched?
- Are there specific times or days when users are more active on the platform?

Content Optimization

- What are the most popular movies among different audience segments?
- Are there under-watched but highly-rated movies that we should promote?
- What content categories (e.g., drama, comedy, action) drive the most engagement?
- Is there a difference in engagement between older classics and new releases?

Personalization & Recommendations

- What patterns exist in user ratings that can inform better recommendations?
- Can we predict what a user might watch next based on their history?
- Should we prioritize recommending high-rated content or content that aligns with past behavior?

# Data

## Data Understanding

### Data Source

Our dataset is the [MovieLens](https://grouplens.org/datasets/movielens/latest/) dataset, created by the GroupLens research lab at the University of Minnesota. It contains 100,000 ratings and 3,600 tags and was last updated in September 2018. The dataset includes several CSV files, such as 'tags.csv' (user-created tags for movies) and 'links.csv' (keys that link our data to IMDb and TMDB information). We do not use these files and instead focus on 'movies.csv' and 'ratings.csv'.

### Data Description

movies.csv - contains the nearly 10,000 movies that have been rated, with their titles and genres.

- movieId - the Id of the movie. This will be used to merge this information with other data.
- title - the title of the movie, which generally also contains the release year.
- genres - the genres that the movie belongs to, stored as a single delimited string such as 'Action|Adventure|Animation'. These combined tags can be split into their individual components for content-based features.
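
The genre split mentioned above could look like the following sketch. It assumes the genres are pipe-delimited, and the two rows are hypothetical stand-ins laid out like movies.csv rather than real entries.

```python
import pandas as pd

# Hypothetical rows in the same layout as movies.csv
movies = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["Movie A (1995)", "Movie B (1995)"],
    "genres": ["Adventure|Animation|Children", "Adventure|Children|Fantasy"],
})

# One indicator column per genre; sep tells pandas how the labels are delimited
genre_flags = movies["genres"].str.get_dummies(sep="|")
movies_encoded = pd.concat([movies[["movieId", "title"]], genre_flags], axis=1)
print(movies_encoded)
```

Each genre becomes its own 0/1 column, which is a convenient input for content-based similarity measures.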

ratings.csv - contains over 100,000 user ratings.

- userId - the Id of the user who left the rating. This will be used to merge with other data.
- movieId - the Id of the movie. We will use this Id to merge with other data.
- rating - the rating, ranging from 0.5 to 5.0.
- timestamp - the time when the rating was left. 

In [None]:
# Loading the Data
# Suppress warnings
import warnings

# Standard Libraries
import numpy as np
import pandas as pd
import random

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Data Processing & Utilities
from zipfile import ZipFile
from scipy.sparse import csr_matrix

# Machine Learning & Model Evaluation
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics.pairwise import cosine_similarity


# Apache Spark for Large-Scale Processing
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import explode, rank, log2

# Spark Machine Learning (ALS Recommender System)
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.tuning import TrainValidationSplit, ParamGridBuilder

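# Surprise Library for Collaborative Filtering
# (aliased so they do not shadow the scikit-learn imports of the same name above)
from surprise import Dataset, Reader
from surprise.model_selection import cross_validate
from surprise.model_selection import train_test_split as surprise_train_test_split
from surprise.model_selection import GridSearchCV as SurpriseGridSearchCV
from surprise.prediction_algorithms import KNNBaseline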

%matplotlib inline

In [None]:
# Quality of Life (QoL) Enhancements for Reproducibility, Plotting, and Warnings

# Set a random seed for reproducibility across random number generators
random_seed = 42  
np.random.seed(random_seed)  
random.seed(random_seed)  

# Set plotting style for better visuals in charts
plt.style.use('seaborn-v0_8-darkgrid')  

# Suppress all warnings to avoid clutter in output
warnings.filterwarnings('ignore')  

# Disable warnings for chained assignments in pandas (prevents 'SettingWithCopyWarning')
pd.options.mode.chained_assignment = None  

In [None]:
# Loading MovieLens Database and Extracting Data

# Define file paths
zip_file_path = 'data/ml-latest-small.zip'  
extract_folder = 'data/'

# Extract the dataset from the zip file
with ZipFile(zip_file_path, 'r') as zip_ref:  
    zip_ref.extractall(extract_folder)  

# Load the datasets into pandas DataFrames
datasets = {
    'movies': 'movies.csv', 
    'ratings': 'ratings.csv'
}

# Using a loop to load all datasets at once
dataframes = {}
for key, file_name in datasets.items():
    dataframes[key] = pd.read_csv(f'data/ml-latest-small/{file_name}', encoding='utf-8')

# Access individual dataframes as needed, for example:
movies_df = dataframes['movies']
ratings_df = dataframes['ratings']

### Data Cleaning

Little data cleaning was needed in this notebook, as we only used movies.csv and ratings.csv for our modeling. These files were already well organized and required no cleaning. (Note: our exploratory notebook contains more extensive cleaning that can support further testing in the future; see Limitations.)

### Data Preparation

No additional preparation was required beyond loading movies.csv and ratings.csv. In future implementations of our model, users who have left too few ratings, or movies that have received too few ratings, could be removed from our modeling data.
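
One possible way to apply such a filter is sketched below with pandas. The function name `filter_min_ratings`, the thresholds, and the toy data are assumptions for illustration; this step was not part of the modeling in this notebook.

```python
import pandas as pd

def filter_min_ratings(ratings, min_user_ratings=2, min_movie_ratings=2):
    """Drop users and movies with fewer ratings than the given thresholds (single pass)."""
    user_counts = ratings.groupby("userId")["rating"].transform("count")
    movie_counts = ratings.groupby("movieId")["rating"].transform("count")
    keep = (user_counts >= min_user_ratings) & (movie_counts >= min_movie_ratings)
    return ratings[keep]

# Hypothetical ratings: user 3 and movie 30 each appear only once
toy = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3],
    "movieId": [10, 20, 10, 20, 30],
    "rating":  [4.0, 3.5, 5.0, 2.0, 4.5],
})
filtered = filter_min_ratings(toy)
print(filtered)
```

Because removing rows can push other users or movies below the threshold, a stricter version would repeat the filter until the row count stops changing.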

### Matrix Sparsity Analysis

Before building our hybrid recommender system, we analyze the sparsity of the user-item matrix to check how much rating history is available per user and per movie. Although the hybrid design accounts for cold-start issues, a model trained on an extremely sparse matrix may not generalize its predictions well, while too little sparsity may signal bias in the dataset.

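Sparsity is calculated as: sparsity = 1 - (number of non-zero interactions / total possible interactions), where the total possible interactions equal the number of users multiplied by the number of movies.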
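
As a quick worked example of the formula above, the sketch below applies it to a small hypothetical 3 x 4 rating matrix, where 0 stands for "not rated".

```python
import numpy as np

# Hypothetical ratings for 3 users and 4 movies (0 means no rating)
toy_matrix = np.array([
    [5.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 4.0, 0.0],
    [0.0, 2.5, 0.0, 0.0],
])

n_rated = np.count_nonzero(toy_matrix)   # 4 observed ratings
n_possible = toy_matrix.size             # 3 users * 4 movies = 12
toy_sparsity = 1 - n_rated / n_possible
print(f"{toy_sparsity:.2%}")
```

Four of the twelve cells are filled, so two thirds of the matrix is empty.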

In [None]:
user_encoder = LabelEncoder()
item_encoder = LabelEncoder()

In [None]:
ratings_df['userId_x'] = user_encoder.fit_transform(ratings_df['userId'])
ratings_df['movieId_x'] = item_encoder.fit_transform(ratings_df['movieId'])

In [None]:
num_users = ratings_df['userId_x'].nunique()
num_movies = ratings_df['movieId_x'].nunique()

In [None]:
interaction_matrix = csr_matrix(
    (ratings_df['rating'], (ratings_df['userId_x'], ratings_df['movieId_x'])),
    shape=(num_users, num_movies)
)

In [None]:
print(f'Created sparse matrix with shape: {interaction_matrix.shape}')

In [None]:
num_interactions = interaction_matrix.nnz
sparsity = 1 - (num_interactions / (num_users * num_movies))

In [None]:
print(f'Sparsity of the interaction matrix: {sparsity:.2%}')

The results of our matrix sparsity analysis indicate that our dataset is highly sparse: the interaction matrix has shape (610, 9724) and a sparsity of 98.30%. Because the dataset is this sparse and contains explicit ratings, we will use a hybrid recommendation system built from both KNN and ALS models.

In [None]:
# Count the ratings each user has left (sum() would add up the rating values instead)
user_interactions = interaction_matrix.getnnz(axis=1)

In [None]:
plt.hist(user_interactions, bins=20, log=True, color='blue', alpha=0.7)
plt.xlabel('Number of Interactions per User')
plt.ylabel('Frequency (log scale)')
plt.title('User Interaction Distribution');

In [None]:
# Count the ratings each movie has received (sum() would add up the rating values instead)
item_interactions = interaction_matrix.getnnz(axis=0)

In [None]:
plt.hist(item_interactions, bins=50, log=True, color='blue', alpha=0.7)
plt.xlabel('Number of Interactions per Item')
plt.ylabel('Frequency (log scale)')
plt.title('Item Interaction Distribution');

This 'long-tail' distribution shows that the majority of our movies have few ratings/interactions, while a small number of popular movies have a large number of interactions. This can lead to cold-start problems, where the system fails to recommend an item because it has too little information to go on. We will use a hybrid recommendation system to account for this.
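
One way to put a number on the long tail is to measure how much of the rating volume belongs to the most-rated movies. The sketch below is illustrative only: `head_share` and the toy counts are assumptions, not values from the MovieLens data.

```python
import numpy as np

def head_share(rating_counts, top_frac=0.1):
    """Fraction of all ratings that belong to the most-rated `top_frac` of movies."""
    counts = np.sort(np.asarray(rating_counts))[::-1]
    n_head = max(1, int(round(len(counts) * top_frac)))
    return float(counts[:n_head].sum() / counts.sum())

# Hypothetical per-movie rating counts with a long tail
toy_counts = [100, 50, 5, 3, 2, 1, 1, 1, 1, 1]
print(f"{head_share(toy_counts):.1%} of ratings come from the top 10% of movies")
```

On the real data, the same function could be applied to the per-movie rating counts from the interaction matrix.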

In [None]:
sample_users = 200
sample_items = 200

In [None]:
sns.heatmap(interaction_matrix[:sample_users, :sample_items].toarray(), cmap='YlGnBu', cbar=False)
plt.xlabel('Items')
plt.ylabel('Users')
plt.title('Interaction Matrix Heatmap (Subset)');

Most of the heatmap is off-white, which represents missing (zero) entries in the matrix, while the darker cells represent higher rating values. This confirms that the dataset is highly sparse.

In [None]:
cold_users = np.sum(interaction_matrix, axis=1) == 0
cold_items = np.sum(interaction_matrix, axis=0) == 0

In [None]:
print(f'Cold-start users: {np.sum(cold_users)} / {interaction_matrix.shape[0]}')
print(f'Cold-start items: {np.sum(cold_items)} / {interaction_matrix.shape[1]}')

This shows us that, currently, every user has interacted with at least one item, and every item has been interacted with by at least one user. 

# Modeling

We began with Alternating Least Squares (ALS) using Apache Spark as our baseline model. ALS is a collaborative filtering method that factorizes the user-item matrix into two smaller matrices, capturing latent factors of users and items. The model alternates between the two, holding one matrix fixed while solving for the other, to minimize the squared error between observed and predicted ratings. We tuned the following hyperparameters:

* rank (number of latent factors),
* maxIter (maximum iterations for optimization),
* regParam (regularization to prevent overfitting).
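
To illustrate the alternating updates described above, here is a small NumPy sketch of ALS on a toy matrix. It is a simplified stand-in and not Spark's implementation: it uses plain L2 regularization, treats 0 as "not rated", and the names `als_factorize`, `rank`, `reg`, and `n_iters` only loosely mirror the Spark parameters.

```python
import numpy as np

def als_factorize(R, rank=2, reg=0.1, n_iters=20, seed=0):
    """Factor R ~ U @ V.T using only the observed (non-zero) entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    observed = R > 0
    eye = np.eye(rank)
    for _ in range(n_iters):
        # Hold V fixed and solve a small ridge regression for each user
        for u in range(n_users):
            idx = observed[u]
            if idx.any():
                Vu = V[idx]
                U[u] = np.linalg.solve(Vu.T @ Vu + reg * eye, Vu.T @ R[u, idx])
        # Hold U fixed and solve a small ridge regression for each item
        for i in range(n_items):
            idx = observed[:, i]
            if idx.any():
                Ui = U[idx]
                V[i] = np.linalg.solve(Ui.T @ Ui + reg * eye, Ui.T @ R[idx, i])
    return U, V

# Hypothetical 4 x 4 rating matrix (0 means no rating)
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 0.0],
    [0.0, 1.0, 4.0, 5.0],
])
U, V = als_factorize(R)
pred = U @ V.T
mask = R > 0
train_rmse = float(np.sqrt(np.mean((R[mask] - pred[mask]) ** 2)))
print(round(train_rmse, 3))
```

The product `U @ V.T` also fills in the cells that were 0 in `R`, which is how the factorization produces predictions for unrated movies.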

While ALS serves as our fallback for cold-start cases, it was slightly less accurate than KNN for well-established users with substantial rating histories.

To improve recommendations, we implemented a K-Nearest Neighbors (KNN) Baseline model using Surprise. KNN finds the K most similar users or items and predicts ratings based on their aggregated preferences. We tuned:

* k (number of neighbors considered),
* min_k (minimum neighbors required for aggregation).
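
The roles of k and min_k can be seen in the simplified user-based sketch below. It is not Surprise's KNNBaseline (which also uses baseline estimates): it predicts from the user's mean rating plus a similarity-weighted average of neighbour deviations, and the function name and toy matrix are assumptions.

```python
import numpy as np

def predict_user_knn(R, user, item, k=2, min_k=1):
    """Predict R[user, item] from the k most similar users who rated `item`.

    Zeros in R mean "not rated". Falls back to the user's mean rating when
    fewer than `min_k` positively correlated neighbours are available.
    """
    rated = R > 0
    user_mean = R[user][rated[user]].mean()
    candidates = []
    for other in range(R.shape[0]):
        if other == user or not rated[other, item]:
            continue
        common = rated[user] & rated[other]
        if common.sum() < 2:
            continue
        a, b = R[user, common], R[other, common]
        if a.std() == 0 or b.std() == 0:
            continue
        sim = np.corrcoef(a, b)[0, 1]  # Pearson correlation on co-rated movies
        if sim > 0:
            other_mean = R[other][rated[other]].mean()
            candidates.append((sim, R[other, item] - other_mean))
    top = sorted(candidates, reverse=True)[:k]
    if len(top) < min_k:
        return float(user_mean)
    sims = np.array([s for s, _ in top])
    devs = np.array([d for _, d in top])
    return float(user_mean + sims @ devs / sims.sum())

# Hypothetical ratings: user 0 has not rated movie 3
R_toy = np.array([
    [5.0, 4.0, 1.0, 0.0],
    [4.0, 5.0, 1.0, 1.0],
    [1.0, 2.0, 5.0, 5.0],
    [5.0, 5.0, 2.0, 1.0],
])
print(predict_user_knn(R_toy, user=0, item=3, k=2, min_k=1))
print(predict_user_knn(R_toy, user=0, item=3, k=2, min_k=3))
```

In the second call only two suitable neighbours exist, so the min_k requirement is not met and the prediction falls back to the user's mean rating.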

To optimize both models, we performed grid search, systematically testing hyperparameter combinations to find the best-performing configuration. Our final KNN Baseline model (with k=56 and min_k=14) achieved an RMSE of 0.886, outperforming ALS (RMSE = 0.888). The improvement, while marginal, suggests KNN's advantage in leveraging explicit user rating similarities, making it more effective for established users.

However, KNN struggles with sparsity, making ALS a valuable fallback for new users or unrated items. A hybrid system leveraging both models balances accuracy and cold-start mitigation.

## Alternating Least Squares

In [None]:
# Start a Spark session
spark = SparkSession.builder.appName('MovieRecommender').getOrCreate()
spark.sparkContext.setLogLevel("ERROR") 

In [None]:
#loading our dataset using spark.
spark_df = spark.read.csv('data/ml-latest-small/ratings.csv', header=True, inferSchema=True)

In [None]:
# Create the training, validation, and hold-out test sets (64% / 16% / 20%).
train, validation, hold_out = spark_df.randomSplit([0.64, 0.16, 0.2], seed=random_seed)

In [None]:
# Defining our ALS (Spark) model.
als = ALS(
    userCol='userId',
    itemCol='movieId',
    ratingCol='rating',
    coldStartStrategy='drop',
    nonnegative= True
)

In [None]:
# Creating our parameter grid for our ALS spark model
param_grid = ParamGridBuilder()\
    .addGrid(als.rank, [12, 13, 14])\
    .addGrid(als.maxIter, [18, 19, 20])\
    .addGrid(als.regParam, [.17, .18, .19])\
    .build()

In [None]:
# Creating our Regression Evaluator using RMSE as our metric.

reg_evaluator = RegressionEvaluator(
    metricName='rmse',
    labelCol ='rating',
    predictionCol='prediction'
)

In [None]:
# Tune hyperparameters with TrainValidationSplit (a single train/validation split, not k-fold cross-validation)

train_val_split = TrainValidationSplit(
    estimator=als,
    estimatorParamMaps=param_grid,
    evaluator= reg_evaluator)

In [None]:
# Training our ALS model
spark_model = train_val_split.fit(train)

In [None]:
# Extracting the best model from the tuning process using ParamGridBuilder
best_model = spark_model.bestModel

In [None]:
# Generate predictions
predictions = spark_model.transform(validation)

In [None]:
# Generate RMSE using our prediction and reg_evaluator
rmse = reg_evaluator.evaluate(predictions)

In [None]:
print(f"""  **BEST MODEL**
  RMSE = {rmse}
  Hyperparameters:
  Rank: {best_model.rank}
  Max Iterations: {best_model._java_obj.parent().getMaxIter()}
  Regularization Parameter: {best_model._java_obj.parent().getRegParam()}
  Nonnegative: {best_model._java_obj.parent().getNonnegative()}
  Cold Start Strategy: {best_model._java_obj.parent().getColdStartStrategy()}
""")

In [None]:
als = ALS(
    userCol='userId',
    itemCol='movieId',
    ratingCol='rating',
    coldStartStrategy='drop',
    nonnegative= True,
    rank= 30,
    maxIter= 30,
    regParam= .18
)

In [None]:
# Creating our parameter grid for our ALS spark model
param_grid = ParamGridBuilder()\
    .addGrid(als.rank, [20, 30, 40])\
    .addGrid(als.maxIter, [5, 10, 15])\
    .addGrid(als.regParam, [.10, .20, .30])\
    .build()

In [None]:
# Tune hyperparameters with TrainValidationSplit (a single train/validation split, not k-fold cross-validation)

train_val_split = TrainValidationSplit(
    estimator=als,
    estimatorParamMaps=param_grid,
    evaluator= reg_evaluator)

In [None]:
# Training our ALS model
spark_model = train_val_split.fit(train)

In [None]:
# Extracting the best model from the tuning process using ParamGridBuilder
best_model = spark_model.bestModel

In [None]:
# Generate predictions
predictions = spark_model.transform(validation)

# Generate RMSE using our prediction and reg_evaluator
rmse = reg_evaluator.evaluate(predictions)

# Printing our evaluation metrics and model parameters
print(f"""  RMSE = {rmse} of current model.
  **BEST MODEL**
  Hyperparameters:
  Rank: {best_model.rank}
  Max Iterations: {best_model._java_obj.parent().getMaxIter()}
  Regularization Parameter: {best_model._java_obj.parent().getRegParam()}
  Nonnegative: {best_model._java_obj.parent().getNonnegative()}
  Cold Start Strategy: {best_model._java_obj.parent().getColdStartStrategy()}
""")

## K-nearest Neighbors

In [None]:
reader = Reader(rating_scale=(0.5, 5))  # MovieLens ratings range from 0.5 to 5.0

expected_column_names = ["userId", "movieId", "rating"]

# Load the data into a Surprise Dataset
data_surp = Dataset.load_from_df(ratings_df[expected_column_names], reader)

In [None]:
# cross validating with KNNBaseline
knn_baseline = KNNBaseline(sim_options={'name':'pearson', 'user_based':True})
cv_knn_baseline = cross_validate(knn_baseline, data_surp, n_jobs=-1)

In [None]:
for i in cv_knn_baseline.items():
    print(i)
    
print('-----------------------')
# print validation results
np.mean(cv_knn_baseline['test_rmse'])

In [None]:
# Our best RMSE scores come from the KNNBaseline model with k=56 and min_k=14, so that is what we use as our final model.
# Cross-validating the final KNNBaseline configuration
final_knn_baseline = KNNBaseline(k=56, min_k=14, sim_options={'name':'pearson', 'user_based':True})
final_cv_knn_baseline = cross_validate(final_knn_baseline, data_surp, n_jobs=-1)

for i in final_cv_knn_baseline.items():
    print(i)
    
print('-----------------------')
# print validation results
np.mean(final_cv_knn_baseline['test_rmse'])

## Evaluation 

In [None]:
# It seems like our second best scores are with our ALS model, which will help address our issues of cold start. 
# Defining our ALS (Spark) model.
als = ALS(
    userCol='userId',
    itemCol='movieId',
    ratingCol='rating',
    coldStartStrategy='drop',
    nonnegative= True
)

# Creating our parameter grid for our ALS spark model
param_grid = ParamGridBuilder()\
    .addGrid(als.rank, [12, 13, 14])\
    .addGrid(als.maxIter, [18, 19, 20])\
    .addGrid(als.regParam, [.17, .18, .19])\
    .build()

# Creating our Regression Evaluator using RMSE as our metric.

reg_evaluator = RegressionEvaluator(
    metricName='rmse',
    labelCol ='rating',
    predictionCol='prediction'
)

# Tune hyperparameters with TrainValidationSplit (a single train/validation split, not k-fold cross-validation)

train_val_split = TrainValidationSplit(
    estimator=als,
    estimatorParamMaps=param_grid,
    evaluator= reg_evaluator)

# Training our ALS model
final_spark_model = train_val_split.fit(train)

# Extracting the best model from the tuning process using ParamGridBuilder
best_model = final_spark_model.bestModel

# Generate predictions
hold_out_predictions = final_spark_model.transform(hold_out)

# Generate RMSE using our prediction and reg_evaluator
hold_out_rmse = reg_evaluator.evaluate(hold_out_predictions)

print(f"""  RMSE = {hold_out_rmse} of current model.
  **HOLD OUT MODEL**
  Hyperparameters:
  Rank: {best_model.rank}
  Max Iterations: {best_model._java_obj.parent().getMaxIter()}
  Regularization Parameter: {best_model._java_obj.parent().getRegParam()}
  Nonnegative: {best_model._java_obj.parent().getNonnegative()}
  Cold Start Strategy: {best_model._java_obj.parent().getColdStartStrategy()}
""")

In [None]:
# Build a Surprise dataset from the ratings_df pandas DataFrame
# (MovieLens ratings range from 0.5 to 5.0)
reader = Reader(rating_scale=(0.5, 5))
data = Dataset.load_from_df(ratings_df[['userId', 'movieId', 'rating']], reader)

# Create the trainset
trainset = data.build_full_trainset()

# Fit the final KNNBaseline model on the trainset
final_knn_baseline = KNNBaseline(k=56, min_k=14, sim_options={'name': 'pearson', 'user_based': True})
final_knn_baseline.fit(trainset)

for i in final_cv_knn_baseline.items():
    print(i)
    
print('-----------------------')
# print validation results
np.mean(final_cv_knn_baseline['test_rmse'])

In [None]:
# Creating a function to get movie recommendations so that we can evaluate our predictions as we develop our model.

def get_movie_recommendations(userId=None, k=5):
    if userId is None:
        # Cold Start: Use ALS model
        als_recommendations = best_model.recommendForAllUsers(k)

        # Extract top k recommendations
        top_recommendations = als_recommendations.select('userId', F.explode('recommendations').alias('recommendation')) \
                                                 .select('userId', 'recommendation.movieId', 'recommendation.rating')

        # Convert to Pandas
        top_recommendations = top_recommendations.toPandas()

        # Join with movies_df for titles
        top_recommendations = top_recommendations.merge(movies_df, on="movieId", how="left")

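        # Keep each movie once (its highest predicted rating) before taking the top k
        top_recommendations = top_recommendations.sort_values(by='rating', ascending=False).drop_duplicates(subset='movieId')
        return top_recommendations[['movieId', 'title', 'rating']].head(k)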

    else:
        # Known User: Use KNN model
        inner_uid = trainset.to_inner_uid(userId)

        # The model is user-based, so get_neighbors returns the inner ids of similar *users*
        neighbor_uids = final_knn_baseline.get_neighbors(inner_uid, k)

        # Candidate movies are those the neighbours rated; remember which neighbours rated each one
        raters = {}
        for uid in neighbor_uids:
            for inner_iid, _ in trainset.ur[uid]:
                raters.setdefault(inner_iid, []).append(uid)

        # Map inner item ids to raw movie IDs
        movie_ids = [trainset.to_raw_iid(i) for i in raters]

        # Predict ratings
        predicted_ratings = [final_knn_baseline.predict(userId, movie_id).est for movie_id in movie_ids]

        # Create DataFrame
        recommendations_df = pd.DataFrame({'movieId': movie_ids, 'predicted_rating': predicted_ratings})

        # Join with movies_df for titles
        recommendations_df = recommendations_df.merge(movies_df, on="movieId", how="left")

        # Get actual ratings (if available)
        actual_ratings = ratings_df[ratings_df['userId'] == userId][['movieId', 'rating']].rename(columns={'rating': 'actual_rating'})

        # Merge actual ratings
        recommendations_df = recommendations_df.merge(actual_ratings, on="movieId", how="left")

        # Calculate prediction error
        recommendations_df['error'] = (recommendations_df['predicted_rating'] - recommendations_df['actual_rating']).abs()

        # Similarity column (the model was fit with Pearson similarity, not cosine)
        recommendations_df['neighbor_similarity'] = None  # Default value

        # If no actual ratings exist, report the mean similarity of the neighbours who rated each movie
        if recommendations_df['actual_rating'].isnull().all():
            user_sims = final_knn_baseline.sim[inner_uid]  # Similarity of this user to every other user
            recommendations_df['neighbor_similarity'] = [
                np.mean([user_sims[uid] for uid in uids]) for uids in raters.values()
            ]

        columns_to_display = ['movieId', 'title', 'predicted_rating', 'actual_rating', 'error', 'neighbor_similarity']

        # Display top k recommendations
        return recommendations_df[columns_to_display].sort_values(by='predicted_rating', ascending=False).head(k)

# Example usage:
print(get_movie_recommendations(userId=40))  # For a known user
print(get_movie_recommendations(userId=None))  # For cold start scenario

Our hybrid recommendation system works, with our KNN model providing an RMSE score of 0.8649660909040028, while our ALS model has a score of 0.8863465869481534.

When a user has a sufficient rating history, the KNN model's predicted ratings are typically within about 0.88 rating points of the user's actual ratings. However, when a user has little to no rating history, or a movie has too few ratings, we cannot accurately predict what the rating will be. In such cases, our ALS model is used to recommend movies based on latent factors.
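
The routing between the two models can be sketched as below. This is an illustrative assumption about how the fallback could be wired, not the notebook's implementation: `recommend_hybrid`, the `min_ratings` threshold, and the stand-in recommenders are all hypothetical.

```python
def recommend_hybrid(user_id, rating_counts, knn_recommend, als_recommend, k=5, min_ratings=20):
    """Route a request to KNN for established users and to the ALS fallback otherwise."""
    if user_id is not None and rating_counts.get(user_id, 0) >= min_ratings:
        return "knn", knn_recommend(user_id, k)
    return "als", als_recommend(k)

# Stand-in recommenders so the sketch runs without trained models
def fake_knn(user_id, k):
    return [f"knn-movie-{i}" for i in range(k)]

def fake_als(k):
    return [f"als-movie-{i}" for i in range(k)]

# Hypothetical number of ratings left by each known user
counts = {40: 103, 7: 3}
print(recommend_hybrid(40, counts, fake_knn, fake_als))    # established user
print(recommend_hybrid(7, counts, fake_knn, fake_als))     # too little history
print(recommend_hybrid(None, counts, fake_knn, fake_als))  # brand-new user
```

The threshold for "sufficient" history is a tuning choice; it could be set by checking where KNN's error starts to exceed that of the ALS fallback.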

During model development, we identified several data limitations that impact recommendation accuracy, particularly in addressing the cold-start problem. Additional user information—such as favorite genres, actors, or directors—could improve recommendations for new users. Tracking unrated movies, watch duration (e.g., whether a user finished a movie or stopped after a few minutes), and rewatch frequency would provide deeper insights into viewing habits. Incorporating these behavioral signals could enhance prediction accuracy and better capture user preferences beyond explicit ratings.

## Conclusion

Implementing a hybrid recommendation system, we combined the strengths of ALS and KNN Baseline to deliver personalized movie suggestions while addressing the cold-start problem. Our grid search optimization identified KNN as the best-performing model (RMSE = 0.886), outperforming the ALS baseline (RMSE = 0.888). While KNN excels for users with rich rating histories, ALS remains essential for recommending movies to new users or sparsely rated items.

Further improvements could be made by incorporating additional user behavior data—such as watch duration, rewatch frequency, and unrated views—to refine predictions. With this hybrid approach, our client can maximize user engagement and deliver high-quality recommendations as their streaming platform grows.

For further details, please refer to the following linked project notebook and presentation:

project notebook presentation