from IPython.display import HTML

def css_styling():
    # Read the notebook's custom stylesheet and return it as renderable HTML.
    with open("custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)

css_styling()

# Capstone Project Instructions

## Objective
The objective of this capstone project is to apply data-driven solutions to a real-world problem. Students will work in groups to develop a comprehensive project deliverable, including a Statement of Work (SOW), a report outlining their data-driven solution, presentation slides, and a final presentation.

## Group Formation
1. Students should form groups of [insert number of members] members.
2. Groups can be formed around shared interests or complementary skills, or assigned by the instructor.

## Project Deliverables
1. **Statement of Work (SOW):**
   - The SOW should clearly define the problem statement, project goals, objectives, and deliverables.
   - It should outline the scope of work, including data sources, tools, and techniques that will be used.
   - The SOW should also provide a timeline with key milestones and deadlines.
   
2. **Data-Driven Solution Report:**
   - The report should present the group's approach to solving the problem.
   - It should include a thorough analysis of the data, data preprocessing steps, and any models or algorithms used.
   - Results and findings should be clearly documented, along with any limitations or challenges encountered.
   - The report should demonstrate the effectiveness and potential impact of the data-driven solution.
   
3. **Presentation Slides:**
   - Each group should create visually appealing and informative presentation slides.
   - Slides should provide an overview of the project, including the problem statement, methodology, results, and conclusions.
   - Visualizations, charts, and graphs can be used to enhance the presentation.
   
4. **Final Presentation:**
   - On the last day of the project, each group will deliver a presentation summarizing their work.
   - Presentations should be clear, concise, and engaging, lasting approximately [insert duration] minutes per group.
   - All group members should actively participate in the presentation.
   - The presentation should cover the key aspects of the project, highlighting the data-driven solution and its impact.
   
## Evaluation Criteria
Projects will be evaluated based on the following criteria:
- Clarity and completeness of the SOW.
- Depth and quality of the data-driven solution report.
- Effectiveness and creativity of the presentation slides.
- Coherence and professionalism of the final presentation.
  



# Capstone Project: Fine Foods Reviews Analysis and Sentiment Classification

## Project Description

The Fine Foods Reviews Analysis and Sentiment Classification project aims to analyze the Amazon Fine Food Reviews dataset, which consists of reviews for fine foods from Amazon. The project focuses on exploring the dataset, performing sentiment analysis, and building a machine learning model for sentiment classification. By working on this project, students will gain practical experience in data preprocessing, exploratory analysis, visualization, natural language processing, deep learning, parameter tuning, and Azure ML.

## Dataset

- Amazon Fine Food Reviews dataset
- Number of reviews: 568,454
- Number of users: 256,059
- Number of products: 74,258
- Timespan: Oct 1999 - Oct 2012
- Number of Attributes/Columns in data: 10
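
The figures above can be sanity-checked once the data is loaded into pandas. The sketch below is one way to do that, not a prescribed method: it assumes columns named `UserId`, `ProductId`, and `Time` (stored as Unix seconds), and it uses a tiny in-memory sample in place of the real file. The helper name `summarize_reviews` is hypothetical.

```python
import pandas as pd


def summarize_reviews(df: pd.DataFrame) -> dict:
    """Return headline statistics for a reviews table.

    Assumes columns named UserId, ProductId and Time (Unix seconds);
    adjust the names if your copy of the dataset differs.
    """
    timestamps = pd.to_datetime(df["Time"], unit="s")
    return {
        "n_reviews": len(df),
        "n_users": df["UserId"].nunique(),
        "n_products": df["ProductId"].nunique(),
        "n_columns": df.shape[1],
        "first_review": timestamps.min().strftime("%b %Y"),
        "last_review": timestamps.max().strftime("%b %Y"),
    }


# Tiny illustrative sample; the real data would come from something like
# pd.read_csv("Reviews.csv") (the file name is an assumption).
sample = pd.DataFrame(
    {
        "ProductId": ["P1", "P1", "P2"],
        "UserId": ["U1", "U2", "U1"],
        "Score": [5, 1, 4],
        "Time": [939340800, 1303862400, 1351209600],
    }
)
stats = summarize_reviews(sample)
print(stats)
```

Running the same function on the full dataset and comparing the result with the list above is a quick way to confirm the download is complete.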


## Business Questions

Many business questions can be answered by analyzing this dataset. Assume that our client wants answers to the following questions, each paired with a potential data-driven approach:

- **Question**: How do customers feel about our products and brands?
   - **Answer**: Customer sentiment analysis provides insights into areas of improvement or customer satisfaction.

- **Question**: Can we predict how customers will rate our products based on their reviews?
   - **Answer**: Review rating prediction provides insights into customer preferences and helps assess product performance.

- **Question**: How can we identify emerging trends in customer opinions and forecast future preferences?
   - **Answer**: Trend analysis and forecasting help understand changes in customer preferences and anticipate future trends.

In this project, we will work on the questions above to address these business problems.
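
As a rough illustration of how the first and third questions might be framed in code, the sketch below derives a coarse sentiment label from the star rating and computes a monthly average rating as a simple trend signal. The thresholds (1-2 negative, 3 neutral, 4-5 positive) are an assumption, as are the column names `Score` and `Time`; the sample rows are made up.

```python
import pandas as pd


def score_to_sentiment(score: int) -> str:
    """Map a 1-5 star rating to a coarse label (thresholds are an assumption)."""
    if score <= 2:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


def monthly_average_score(df: pd.DataFrame) -> pd.Series:
    """Average star rating per calendar month, indexed by month period."""
    months = pd.to_datetime(df["Time"], unit="s").dt.to_period("M")
    return df.groupby(months)["Score"].mean()


# Made-up rows: two reviews in April 2011, one each in Sep and Oct 2012.
reviews = pd.DataFrame(
    {
        "Score": [5, 1, 3, 4],
        "Time": [1303862400, 1303948800, 1346976000, 1351209600],
    }
)
reviews["Sentiment"] = reviews["Score"].map(score_to_sentiment)
trend = monthly_average_score(reviews)
print(reviews["Sentiment"].tolist())
print(trend)
```

Labels derived this way can serve as training targets for a classifier, while the monthly series is a starting point for trend analysis.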

## Common Project Tasks for Students

1. **Data Preprocessing:**
   - Clean the dataset by handling missing values, duplicates, outliers, and irrelevant columns.
   - Perform text preprocessing techniques such as tokenization, lowercasing, and removing stop words and punctuation.

2. **Exploratory Data Analysis:**
   - Use pandas to explore the dataset, understanding its structure, data types, and statistical properties.
   - Visualize the distribution of ratings, helpfulness, and other relevant attributes using appropriate plots and charts.

3. **Sentiment Analysis:**
   - Perform sentiment analysis on the review text using natural language processing techniques.
   - Apply sentiment classification algorithms to classify reviews into positive, negative, or neutral sentiments.
   - Evaluate the performance of the sentiment classification model using appropriate evaluation metrics.

4. **Deep Learning with TensorFlow/Keras:**
   - Introduce deep learning concepts and architectures, such as neural networks and deep neural networks.
   - Implement deep learning models using TensorFlow or Keras.
   - Train and evaluate deep learning models for sentiment analysis using the preprocessed dataset.

5. **Parameter Tuning:**
   - Explore hyperparameter tuning techniques to optimize the performance of the deep learning models.
   - Utilize methods like grid search or random search to find the best hyperparameters.
   - Compare and evaluate the performance of the tuned models against the baseline models.

6. **Integration with Azure ML:**
   - Create an Azure ML workspace and compute instance.
   - Utilize Azure ML to develop and deploy the deep learning models.
   - Train the models using Azure ML's capabilities and evaluate their performance.

7. **Model Evaluation and Interpretation:**
   - Evaluate the performance of the deep learning models and compare them with the baseline models.
   - Interpret the results and analyze the impact of deep learning and parameter tuning on the model's performance.
   - Provide insights and recommendations based on the evaluation.

8. **Visualization with Python's Data Visualization Libraries (Matplotlib, Seaborn, Pandas, etc.):**
   - Import the preprocessed data into a data visualization pipeline in a Jupyter notebook.
   - Create custom visualizations to showcase the sentiment analysis and deep learning results.
   - Use the visualization libraries to tell a story with your data, walking the reader through the key insights behind your conclusions.

9. **Capstone Project Presentation:**
   - Prepare a comprehensive presentation summarizing the findings from the data exploration, sentiment analysis, deep learning, and parameter tuning.
   - Showcase the key insights, trends, and recommendations for improving customer satisfaction based on the results.
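
The text-cleaning steps in task 1 (lowercasing, tokenization, stop-word and punctuation removal, duplicate handling) could look roughly like the following, using only the standard library. The stop-word list is deliberately tiny, the HTML-tag stripping assumes review text may contain markup such as `<br />`, and the function names are hypothetical.

```python
import re
import string

# A deliberately small stop-word list for illustration; real projects would
# typically use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "i"}

# Matches HTML tags such as <br />, which may appear in scraped review text.
TAG_PATTERN = re.compile(r"<[^>]+>")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list[str]:
    """Strip tags, lowercase, remove punctuation, tokenize, drop stop words."""
    text = TAG_PATTERN.sub(" ", text).lower()
    text = text.translate(PUNCT_TABLE)
    tokens = text.split()
    return [tok for tok in tokens if tok not in STOP_WORDS]


def drop_duplicate_reviews(reviews: list[str]) -> list[str]:
    """Remove exact duplicate review texts while keeping first-seen order."""
    return list(dict.fromkeys(reviews))


raw = [
    "This is the BEST coffee, ever!<br />Highly recommended.",
    "This is the BEST coffee, ever!<br />Highly recommended.",
    "Stale and bland; I returned it.",
]
cleaned = [preprocess(r) for r in drop_duplicate_reviews(raw)]
print(cleaned)
```

Whitespace splitting is the simplest possible tokenizer; a library tokenizer would handle contractions and hyphenated words more carefully.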

By completing these tasks, students will develop practical skills in data preprocessing, exploratory analysis, sentiment analysis, deep learning with TensorFlow or Keras, parameter tuning, Azure ML integration, visualization with Python tools, and effective communication of project findings. The capstone project will provide students with hands-on experience in leveraging deep learning techniques, optimizing model
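
The random search mentioned in task 5 can be sketched without any ML framework. In the example below, `validation_score` is a hypothetical stand-in for "train a model with these settings and return its validation accuracy", and the search space (learning rate, dropout, hidden units) is illustrative only.

```python
import math
import random


def validation_score(params: dict) -> float:
    """Hypothetical stand-in for 'train a model, return validation accuracy'.

    It peaks at learning_rate=1e-3 and dropout=0.3 purely for illustration.
    """
    lr_penalty = (math.log10(params["learning_rate"]) + 3) ** 2
    dropout_penalty = (params["dropout"] - 0.3) ** 2
    return 1.0 / (1.0 + lr_penalty + dropout_penalty)


def sample_params(rng: random.Random) -> dict:
    """Draw one configuration from an illustrative search space."""
    return {
        # Sample the learning rate log-uniformly between 1e-5 and 1e-1.
        "learning_rate": 10 ** rng.uniform(-5, -1),
        "dropout": rng.uniform(0.0, 0.6),
        "hidden_units": rng.choice([32, 64, 128, 256]),
    }


def random_search(n_trials: int, seed: int = 0) -> tuple[dict, float]:
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = validation_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(n_trials=50)
print(best_params, round(best_score, 3))
```

In a real run, `validation_score` would be replaced by actual model training, and the best configuration would be compared against the untuned baseline as task 5 describes.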


# Deliverables

- Reports of the project
- Presentation