# Introduction to Databases

From the [Columbia University Bulletin](http://bulletin.columbia.edu/columbia-college/departments-instruction/computer-science/#coursestext)

"The fundamentals of database design and application development using databases: entity-relationship modeling, logical design of relational databases, relational data definition and manipulation languages, SQL, query processing, transaction processing. Programming projects are required."


## About your instructor

- Academic experience
    - Ph.D. in Computer Science, Columbia University, 1989
    - Joined Columbia as a full-time _Professor of Professional Practice_ on 01-Jan-2018
    - 8 semesters as an adjunct professor teaching Topics in Computer Science
        - Cloud Computing
        - Web and Internet Application Development
        - Web Application Servers and Applications
        - W4111 - Introduction to Databases


- 30 years of industry experience
    - [IBM Fellow](https://en.wikipedia.org/wiki/IBM_Fellow), Chief Architect for [IBM Software Group](https://en.wikipedia.org/wiki/IBM_Software_Group_(SWG))
    - Microsoft Technical Fellow
    - Executive Vice President, Chief Technology Officer, [CA Technologies](https://www.ca.com/us.html)
    - Vice President, CTO, Senior Fellow, [Dell Software Group](https://en.wikipedia.org/wiki/Dell_Software)
    - Co-Founder and CTO, [Seeka TV](https://seekatv.com/)


- Publications
    - Approximately 60 technical publications.
    - Authored or co-authored several standards in web applications.
    - 12 patents.


- Personal and hobbies
    - Two amazing daughters (one is a Barnard student; one is a sophomore in high school)
    - Interested in languages. Speak Spanish reasonably well and trying to learn Arabic.
    - Black Belt in Kenpo Karate
    - Amateur astronomy
    - Road bicycling
    - Officer in the New York Guard

<br>
<img src="http://www.donald-ferguson.net/blog/wp-content/uploads/2018/01/aboutme.jpeg">

## About this course

- This course is foundational, and will teach you the core concepts in
    - Data modeling
    - Data model implementation and data manipulation.
    - Different database models and database management systems.
    - Implementation of data-centric applications and systems.
    
    
- ANY non-trivial application
    - Requires a well-designed data model.
    - Implements a data model and manipulates data.
    - Uses a database management system.
    
    
- An understanding of databases and database management systems is core to the “hottest fields” in computer science, e.g.
    - Data science
    - Machine learning
    - Intelligent (autonomous) systems
    - Internet-of-Things
    - Cybersecurity
    - Cloud computing
    
   
- <span style="color:red"> __Personal perspective__
    - A large part of my career has been spent figuring out, or leading teams that figured out, how to model, implement and manipulate data.
    - I have used the information in this class more than anything else I have learned.
    - This will likely be true for you.
</span>

## Organization and Logistics

- Lecture: 1010-1125, Tuesday and Thursday, Havemeyer Hall, Room 209

 
- Instructor: Donald F. Ferguson (dff@cs.columbia.edu)


- Instructor Office Hours: 
    - Tuesday, Thursday: 0900-1000
    - Tuesday, Thursday: 1130-1300
    - Location: 623 Shapiro/CEPSR
    
    
- Collaboration/Contact
    - The class is on [Piazza](https://piazza.com/columbia/spring2018/comsw4111_002_2018_1introductiontodatabases)
        - General questions
        - Clarification of homework assignments, class material, etc.
    - Slack, for quick messages and questions.
        - Send direct messages to me, and
        - Please join Slack and the channel [#w4111s18](https://join.slack.com/t/dff-columbia/shared_invite/enQtMjg0Mzk4MTQwMzQxLTZlNzk3OTZmNWE2NzNmNzViZmJlMWVmNWVlZmUxZTU5NjkwYjQ1YTdjMzA3ZTMzZDM3ZmIwYzAyYjIwYTNkZDI) for quick questions/comments to the class.

    - Course Assistants
        - TBA
        - Will join Piazza and the Slack channel, and will announce office hours, contact info, etc.

## Assignments, Exams and Grades
- Point value of assignments and exams
    - 50%: Homework assignments
        - Approximately one HW every two weeks, for a total of 7 or 8.
        - Some will be slightly harder for extra points.
        - Mix of programming assignments and questions.
    - Exams: All exams are "take home exams."
        - Mix of programming assignments and questions.
        - 20% of grade is midterm exam score.
        - 30% is final exam score. 
    - Extra-credit
        - Class participation and office hour participation earn extra-credit points.
        - There will be extra-credit homework projects so that you can make up points lost on homework assignments or exams.


- Late submission
    - You have a total of 5 grace days to apply across all homework assignments.
    - Any lateness rounds up to a full day: 1 minute past the due date counts as 1 day, and 24 hours + 1 minute counts as 2 days.
    - You cannot use grace days for the midterm, the final exam or extra-credit assignments.
    - NOTE:
        - Respect for the individual is paramount. 
        - We will always accommodate illness, family emergencies, etc.
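
The grading weights and the late-day rule above can be sketched in Python. This is only an illustration, not official grading code: the function names are hypothetical, extra credit is ignored, each component score is assumed to be on a 0-100 scale, and a submission exactly 24 hours late is assumed to cost one day (the policy above does not address that case).

```python
import math

# Weights from the syllabus: 50% homework, 20% midterm, 30% final.
WEIGHTS = {"homework": 0.50, "midterm": 0.20, "final": 0.30}
TOTAL_GRACE_DAYS = 5
MINUTES_PER_DAY = 24 * 60


def weighted_score(homework, midterm, final):
    """Combine three component scores (assumed 0-100) into one course score."""
    return (WEIGHTS["homework"] * homework
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


def grace_days_used(minutes_late):
    """Every started 24-hour period after the deadline costs one grace day."""
    if minutes_late <= 0:
        return 0
    return math.ceil(minutes_late / MINUTES_PER_DAY)


def grace_days_remaining(minutes_late_per_homework):
    """Grace days left after a list of (possibly late) homework submissions."""
    used = sum(grace_days_used(m) for m in minutes_late_per_homework)
    return TOTAL_GRACE_DAYS - used
```

For example, `grace_days_used(1)` is 1 and `grace_days_used(24 * 60 + 1)` is 2, matching the two cases listed in the late policy.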

## Environment and Material
- Course material
    - Textbook:
        - _Database Management Systems, 3rd Edition_, Ramakrishnan and Gehrke, ISBN: 978-0072465631
        - We will cover a subset of the material in the textbook, and in a different order.
        - I will bring in examples from industry, engineering and practical experience.
    - Lecture material and examples:
        - Will be Jupyter Notebooks (http://jupyter.org/)
        - Notebooks, slides, sample code, etc. will be available on the [GitHub project](https://donald-f-ferguson.github.io/w4111-Databases/) for the course.
        
        
- Project and development
    - I will primarily use Python and/or JavaScript.
    - You can use the language of your choice, but my ability to help diminishes if you choose a language other than JavaScript, Python or Java.
    
    
- Development environment
    - Database engines and tools
        - We will start with the relational data model. Please install
            - [MySQL](https://dev.mysql.com/doc/refman/5.7/en/installing.html) and
            - [MySQL Workbench](https://dev.mysql.com/doc/workbench/en/wb-installing.html).
            - On a Mac, you can also use Sequel Pro in place of MySQL Workbench.
        - We will also use [Neo4j](https://neo4j.com/) and [Redis](https://redis.io/), but you do not need to install them now.
    - [Integrated Development Environments](https://en.wikipedia.org/wiki/Integrated_development_environment)
        - The JetBrains tools are free for students, and useful for Python, JavaScript and Java.
        - [Eclipse](https://www.eclipse.org/) is an alternative.
        - I will use Anaconda for most of the lectures and examples. You can choose to install
            - Python 3
            - [Anaconda Community Distribution](https://www.anaconda.com/distribution)
            - Anaconda includes the [Spyder](https://github.com/spyder-ide/spyder) IDE for Python, which is sufficient for the course.
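
Once the tools above are installed, a quick sanity check can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not a course requirement: the function name `check_environment` is hypothetical, and the command names `mysql`, `redis-server` and `neo4j` assume default installs that place those executables on your `PATH`, which may not hold on every platform.

```python
import shutil
import sys

# Assumed command names for default installs; adjust them for your platform.
TOOLS = ["mysql", "redis-server", "neo4j"]


def check_environment(tools=TOOLS):
    """Return a dict mapping each check name to True (found) or False."""
    report = {"python3": sys.version_info.major >= 3}
    for tool in tools:
        # shutil.which returns the executable's path, or None if not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'not found'}")
```

A "not found" result for Redis or Neo4j is expected at the start of the course, since neither needs to be installed yet.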