Data Science

Course Overview

Objectives

The primary goal of this Data Science class is to collect (acquire), store, analyze, and visualize data in order to answer questions pertaining to the third order of knowledge.

Students will complete programming assignments throughout the quarter that build toward a final course project meeting the goals above.

Expectations

We expect students to have a strong Java programming background and the ability to manage their own programming environment, including installing and maintaining new tools.

In addition, we expect students to spend a large amount of time learning new technologies and coding outside of scheduled class time.

Finally, the programming assignments are non-trivial: students are expected to understand and develop programs of varying algorithmic complexity. Leaving the homework assignments to the last minute is therefore a surefire way to not pass the course.

Logistics

Instructor-in-charge:

Course assistant:

Schedule:

  • Undergraduate: Sunday 9:10 to 13:00 at ET-A309
  • Graduate: Sunday 13:10 to 17:00 at BIOS-144

Office hours:

  • Sunday 17:00 to 19:00 at BIOS-144
  • Online via Gitter as available (usually 24/7)

Please ask questions early and often rather than leaving them to the last moment (e.g. 1 am before the homework is due). Although the instructors are generally available most of the time, we are not obligated to keep up with last-moment questions right before a due date.

Textbook:

Students do not need to purchase a textbook for this class, as this GitHub repository will provide all course materials.

Computer:

Students are required to have a laptop computer for the course. On some assignments and certainly for the final project where a live demo is required, it is unreasonable to coordinate with anything other than a laptop.

We do not endorse a particular platform or OS; however, we encourage students to pick something that has reasonable computing horsepower. For this reason, Chromebooks or tablets are not suitable for serious software development and data processing.

Course Objectives

  • Collect data
    • Java crawler/collector
    • [Optional] Akka framework
  • Store big data
    • MongoDB
    • Elasticsearch
    • [Optional] HDFS
  • Analyze big data
    • Elasticsearch queries
    • Python
    • [Optional] Hadoop MapReduce (MR)
  • Visualize data
    • Basic JavaScript, HTML & CSS
    • D3.js

Grading Policy

The purpose of this grading policy is to evaluate students' performance fairly. Please review it carefully.

Grading Allocation

  • Homework - 40 points
  • Quizzes - 40 points
  • Project - 20 points
  • Attendance & decorum - 5 points (more below)

Grading Scale

The final grade will be adjusted to a 100-point scale as follows.

  • A: 94 to 100
  • A-: 90 to 93
  • B+: 85 to 89
  • B: 80 to 84

Graduate students are required to earn a grade of 80 or above to pass the course.

  • B-: 77 to 79
  • C+: 74 to 76
  • C: 70 to 73

Undergraduate students are required to earn a grade of 70 or above to pass the course.

  • F: 0 to 69

Successful completion of the course project is required for passing this course.
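
The allocation above totals 105 points, which is then adjusted to a 100-point scale. As a rough illustration only, the sketch below assumes the adjustment is a simple linear rescaling (raw points divided by 105, times 100) rounded to the nearest integer. The syllabus does not state the exact method, so treat `scaled_score`, `letter_grade`, and `passes` as hypothetical helper names rather than the instructors' actual procedure.

```python
# Point allocation from the syllabus (sums to 105, not 100).
MAX_POINTS = {"homework": 40, "quizzes": 40, "project": 20, "attendance": 5}

# Lower bound of each letter grade on the 100-point scale, highest first.
GRADE_FLOORS = [
    (94, "A"), (90, "A-"), (85, "B+"), (80, "B"),
    (77, "B-"), (74, "C+"), (70, "C"),
]


def scaled_score(earned):
    """Rescale raw points (out of 105) to a 100-point scale.

    ASSUMPTION: linear scaling and rounding to the nearest integer are
    illustrative choices; the syllabus does not specify the adjustment.
    """
    for category, points in earned.items():
        if not 0 <= points <= MAX_POINTS[category]:
            raise ValueError(f"{category} points out of range: {points}")
    raw = sum(earned.values())
    total = sum(MAX_POINTS.values())
    return round(raw * 100 / total)


def letter_grade(score):
    """Map a 100-point score to the letter grade listed in the syllabus."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


def passes(score, graduate, project_completed):
    """Pass thresholds: 80 for graduates, 70 for undergraduates.

    The course project must also be completed successfully.
    """
    threshold = 80 if graduate else 70
    return project_completed and score >= threshold
```

For example, a student earning 36, 34, 18, and 5 points in the four categories has 93 raw points, which rescales to 89 and maps to a B+ under these assumptions.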

Course Decorum

We expect you to show civility and concern for your classmates, and to approach the class with a positive attitude and professional demeanor.

This includes remaining alert (and awake!) in class, respecting and never interrupting others, limiting private conversations, and keeping phones off.

Because the class is fast-paced and demanding, you may feel overwhelmed at times. The best way to address these challenges is to discuss the course on Gitter and with the instructors. Please ask for help early and often.

The instructors are obligated to listen to students' concerns and to treat each student with respect and dignity. However, the instructors are not obligated to honor any and all requests, especially when students ask for assistance at the eleventh hour.

Request for a Regrade

In cases of arithmetic errors, please bring the issue to the instructor-in-charge immediately. We will make the necessary adjustments.

In cases where students believe that the grade does not reflect the quality of the work, they must submit a written request for a regrade within 1 week, with the following information:

  • Name and CIN
  • Assignment #/Exam #/Project
  • Clearly articulated rationale with supporting data to back up your claim

We will review your request for a regrade and respond accordingly. Please note that, as a matter of principle, the entire assignment, exam, or project will be re-evaluated. This is because, more often than not, our grading is generous to begin with and we always give students the benefit of the doubt.

It is entirely possible that a regrade request will result in a lower overall score for the assignment or exam. With respect to the course project or the final grade, it is possible that students can actually fail the course. Please do the proper risk-reward analysis.

Some students feel that they deserve a higher grade because ___, where ___ includes (but is not limited to):

  • They worked really hard on the assignment or project
  • They would have to endure shame with family and friends
  • They need XYZ grade to graduate

Please do not bother. We will not entertain such a request.

Late Policy

Students are permitted one 1-week late submission or demonstration on the programming assignments, provided that they coordinate ahead of time with the instructors. After the second week, the grade for the assignment will be 0.

The same provision does not extend to the course project. The course project demonstration will take place during the university's designated final exam day and will not be offered at any other time. Please plan your study schedule and travel arrangements accordingly.

Academic Integrity

Cheating will not be tolerated. Once adjudicated, all involved parties will receive a grade of F for the course and be reported to the Computer Science Department.

Definition of Cheating

In our view, cheating is a disease in academia that undermines the value of a quality education. What's worse, it is a selfish act that is ultimately injurious to the individual and the collective.

To that end, let us remove any ambiguity with a commonly accepted definition of cheating:

Cheating is claiming someone else's work to be your own. Cheating is not just receiving unauthorized assistance on an assignment, exam, or project but also providing the solution to others.

Note that we do not discourage discussion and collaboration. However, we are adamant that you use the class's discussion board and office hours for this purpose.

Determination of a Cheating Incident

Over the years, we have developed a number of strategies and techniques to identify cheating. The biggest telltale sign of cheating is when students do not understand their own solution and/or source code. Second, we have automated tools for source code analysis. Consequently, simply reformatting source code or renaming variables is pointless.

While we will listen to your explanation, once decided we will not change our determination. You can, however, be assured that we will treat students with dignity and respect throughout the process.

Adjudication Process

We will inform all involved parties in writing of our observation, along with evidence and our intention to administer the appropriate penalty. For this course, the only available penalty is an F as the final grade.

Each student then will have 1 week to respond individually and in writing with his or her side of the story.

Note that an apology, recognition of mistake, and/or promise to never do it again will not be considered and will most definitely NOT have an effect on the verdict. A non-response is considered an acceptance of responsibility.

We then seek an opinion from a disinterested third-party reviewer, most likely another instructor. The student's identity will be anonymized when we request a third-party evaluation. The review package will only include:

  • Course syllabus
  • Accusation
  • Supporting Evidence
  • Student's response

The only question that we present to the third-party reviewer will be: did cheating take place? We will be very clear that we are not interested in the gradation, severity, or level of cheating, nor are we interested in intent or rationale.

The bottom line: don't cheat. If you cheat, you will simply get an F for the course. This is not negotiable, so please do the proper risk-reward analysis.

Finally, we will inform all involved of our final decision. By definition, a final decision is final.

ADA statement

Reasonable accommodation will be provided to any student who is registered with the Office of Students with Disabilities and requests needed accommodation.

Course Schedule

The schedule below is tentative and is subject to change.

| Week # [date] | Topic | Notes |
|---|---|---|
| 1 [4/3] | Introduction | Form a team, set up environment |
| 2 [4/10] | Data Acquisition | Homework 1 & Quiz 1 |
| 3 [4/17] | Data Storage | Install MongoDB |
| 4 [4/24] | Data Storage | Install Elasticsearch, Docker; Homework 2 |
| 5 [5/1] | Data Analysis | Elasticsearch Query; Quiz 2 |
| 6 [5/8] | Data Analysis | MongoDB MapReduce; No class but remote video and notes |
| 7 [5/15] | Data Analysis | Install Python; Quiz 3 |
| 8 [5/22] | Data Analysis | Intro to Machine learning |
| 9 [5/29] | Data Visualization | JavaScript and D3; Quiz 4; Homework 3 |
| 10 [6/5] | Project | Project demo; Homework 4 |
| Final [6/12] | Project | Ready for demo! |

You have the choice of presenting the project in either week 10 (June 5) or week 11 (June 12). Please select your date accordingly.

This syllabus is subject to change. In the event of an update, we will notify students and provide a rationale for the adjustment.