# 1. Load SQLite and Database
- 'CourseData.db'
- note the 3 backslashes

In [1]:
%reload_ext sql
%sql sqlite:///CourseData.db

'Connected: @CourseData.db'

In [2]:
%%sql
DROP TABLE IF EXISTS CourseData;

 * sqlite:///CourseData.db
Done.


[]

# safety precautions
- drop tables to start from scratch

In [3]:
%%sql
sqlite:///CourseData.db
DROP TABLE IF EXISTS COURSECATALOGS;
DROP TABLE IF EXISTS COURSES;
DROP TABLE IF EXISTS TEACHERS;
DROP TABLE IF EXISTS COURSEMEETINGS;

Done.
Done.
Done.
Done.


[]

# 2. Do COUNT statements for 3 .csv Files
- ensure .csv name spelling is exact

In [4]:
%%sql
SELECT Count(*) FROM import_course_meetings;

 * sqlite:///CourseData.db
Done.


Count(*)
317339


In [5]:
%%sql
SELECT Count(*) FROM import_courses;

 * sqlite:///CourseData.db
Done.


Count(*)
15955


In [6]:
%%sql
SELECT catalog_id FROM import_Course_Catalog limit 1;


 * sqlite:///CourseData.db
Done.


catalog_id
AN 0301


In [7]:
%%sql
SELECT Count(*) FROM import_Course_Catalog;

 * sqlite:///CourseData.db
Done.


Count(*)
4441


# 3. Create Tables
- primary key is generated on every table by naming new surrogate key/field that isn't already in .csv files.
- don't need NOT NULL for PK because it's implied in SQLite.


In [8]:
%%sql
CREATE TABLE COURSECATALOGS (
    CCID INTEGER PRIMARY KEY,
    catalog_id TEXT,
    program_code TEXT,    
    program_name TEXT,
    course_title TEXT,
    prereqs TEXT,    
    coreqs TEXT,
    fees TEXT,
    attributes TEXT,    
    description TEXT
);

 * sqlite:///CourseData.db
Done.


[]

In [9]:
%%sql
CREATE TABLE TEACHERS (
 TID INTEGER PRIMARY KEY,
 primary_instructor TEXT NOT NULL
);

 * sqlite:///CourseData.db
Done.


[]

In [10]:
%%sql
CREATE TABLE COURSES (
                 CID INTEGER PRIMARY KEY,
                 catalog_id TEXT,
                 crn INTEGER,
                 term TEXT,
                 section TEXT,
                 credits TEXT,
                 title TEXT,
                 meetings TEXT,
                 timecodes TEXT,
                 primary_instructor TEXT,
                 cap TEXT,
                 act TEXT,
                 rem TEXT,
                 CCID INTEGER,
                 TID INTEGER,
                 CMID INTEGER,
                 FOREIGN KEY(CCID) REFERENCES COURSECATALOGS(CCID)
                 FOREIGN KEY (TID) REFERENCES COURSES(TID)
                 FOREIGN KEY (CMID) REFERENCES COURSEMEETINGS(CMID)
);


 * sqlite:///CourseData.db
Done.


[]

In [26]:
%%sql
CREATE TABLE COURSEMEETINGS (
    CMID INTEGER PRIMARY KEY,
    term TEXT,
    crn TEXT,
    location TEXT NOT NULL,	
    day TEXT NOT NULL,
    start TEXT NOT NULL,
    end TEXT NOT NULL
);

 * sqlite:///CourseData.db
Done.


[]

# 4. Do INSERT Statements
- do not include generated PKs
- *reasoning* - PKs aren't included in the imported tables, so we can't insert them from there.
- implied (since all PK/FKs are generated) that  PK/FKs are distinct.

In [27]:
%%sql
INSERT INTO COURSECATALOGS (
    catalog_id,
    program_code,  
    program_name,
    course_title,
    prereqs,    
    coreqs,
    fees,
    attributes,    
    description) 
SELECT DISTINCT 
    catalog_id,
    program_code,  
    program_name,
    course_title,
    prereqs,    
    coreqs,
    fees,
    attributes,    
    description
    FROM import_Course_Catalog;

 * sqlite:///CourseData.db
2221 rows affected.


[]

In [28]:
%%sql
SELECT * FROM COURSECATALOGS
limit 20;

 * sqlite:///CourseData.db
Done.


CCID,catalog_id,program_code,program_name,course_title,prereqs,coreqs,fees,attributes,description
1,AN 0301,AN,Asian Studies,Independent Study,,,,,Students undertake an individualized program of study in consultation with a director from the Asian studies faculty.
2,AN 0310,AN,Asian Studies,Asian Studies Seminar,,,,,"This seminar examines selected topics concerning Asia. This course is taught in conjunction with another 100-300 level course from a rotation of course offerings. Consult the Asian Studies director to identify the conjoined course for a given semester. The seminar concentrates on topics within the parameters of the conjoined course syllabus but adds research emphasis. Students registered for this course must complete a research project, to include 300-level research, in addition to the regular research requirements of the conjoined course, and a 25-50 page term paper in substitution of some portion of the conjoined course requirements, as determined by the instructor. Open to juniors and seniors only."
3,BU 0211,BU,Business,Legal Environment of Business,Junior standing.,,,,"This course examines the broad philosophical as well as practical nature and function of the legal system, and introduces students to the legal and social responsibilities of business. The course includes an introduction to the legal system, the federal courts, Constitutional law, the United States Supreme Court, the civil process, and regulatory areas such as employment discrimination, protection of the environment, and corporate governance and securities markets."
4,BU 0220,BU,Business,Environmental Law and Policy,,,,"EVME Environmental Studies Major Elective, EVPE Environmental Studies Elective, EVSS Environmental Studies: Social Science, MGEL Management: General Elective","This course surveys issues arising out of federal laws designed to protect the environment and manage resources. It considers in detail the role of the Environmental Protection Agency in the enforcement of environmental policies arising out of such laws as the National Environmental Policy Act, the Clean Water Act, and the Clear Air Act, among others. The course also considers the impact of Congress, political parties, bureaucracy, and interest groups in shaping environmental policy, giving special attention to the impact of environmental regulation on business and private property rights."
5,BU 0311,BU,Business,"The Law of Contracts, Sales, and Property",BU 0211.,,,,"This course examines the components of common law contracts including the concepts of offer and acceptance, consideration, capacity and legality, assignment of rights and delegation of duties, as well as discharge of contracts. The course covers Articles 2 and 2A of the Uniform Commercial Code relating to leases, sales of goods, and warranties. The course also considers personal and real property, and bailments."
6,BU 0312,BU,Business,The Law of Business Organizations and Financial Transactions,BU 0211.,,,,"This course offers an analysis of legal principles related to the law of agency, sole proprietorships, partnerships, corporations, limited liability companies, and other business forms. The second half of the course addresses several sections of the Uniform Commercial Code, such as negotiable instruments, bank collections and deposits and secured transactions. Finally, the course examines the law of suretyship, debtor-creditor relationships, and bankruptcy."
7,BU 0320,BU,Business,Employment Law and Discrimination in the Workplace,,,,"MGEL Management: General Elective, UDIV U.S. Diversity","This course examines a variety of legal issues related to the workplace including the doctrine of employment at will, employee privacy, and the history and development of labor unions and the legal protections afforded by the National Labor Relations Act. A study of the role of the Civil Rights Act of 1964 and the Equal Employment Opportunity Commission in eradicating discrimination based on race, sex, religion, national origin, age, and disability occupies a major portion of the course. Other employment issues include affirmative action, worker safety, and compensation."
8,BU 0391,BU,Business,Seminar in Business Law and Ethics,"AE 0291, BU 0211, two additional courses in law or applied ethics.",,,,This interdisciplinary study of these two aspects of the business environment is cross-listed as
9,BL 0101,BL,Black Studies,Black Lives Matter,,,,"ASGW American Studies: Gateway, BSFC Black Studies Focus Course, BSSS Black Studies: Social and Behavioral Sciences, PJST Peace and Justice Studies, UDIV U.S. Diversity","In the context of Ferguson, Charleston, and other national crises, this course responds to the call of students from our campus community to raise questions about and critically reflect upon the failures of democracy to recognize the value of Black Life. This course employs collective thinking, teaching, and research to focus on questions surrounding race, structural inequality, and violence. It examines the historical, geographical, cultural, social, and political ways in which race has been configured and deployed in the United States. Various faculty will bring to bear their respective scholarly lenses so that students understand race and racism across intellectual disciplines."
10,BL 0398,BL,Black Studies,Independent Study,,,,BSCP Black Studies Capstone Course,"Upon request and by agreement with a professor in the program, a Black Studies minor may conduct a one-semester independent study on a defined research topic or field of study."


In [29]:
%%sql
INSERT INTO TEACHERS (
    primary_instructor
)
SELECT DISTINCT
    primary_instructor
    FROM import_courses
    ;

 * sqlite:///CourseData.db
1105 rows affected.


[]

In [15]:
%%sql
SELECT * FROM TEACHERS
limit 5;

 * sqlite:///CourseData.db
Done.


TID,primary_instructor
1,Michael P. Coyne
2,Rebecca I. Bloch
3,Paul Caster
4,Jo Ann Drusbosky
5,Arleen N. Kardos


In [30]:
%%sql 
INSERT INTO COURSEMEETINGS (
    term,
    crn,
    location,	
    day,
    start,
    end)
 SELECT DISTINCT
    crn,
    term,
    location,	
    day,
    start,
    end
    FROM import_course_meetings
    ;

 * sqlite:///CourseData.db
311142 rows affected.


[]

In [42]:
%%sql
INSERT INTO COURSES (
                 TID, 
                 CCID,
                 CMID,
                 primary_instructor,
                 catalog_id,
                 crn,
                 term,
                 section,
                 credits,
                 title,
                 meetings,
                 timecodes,
                 cap,
                 act,
                 rem
)
 SELECT DISTINCT
                TEACHERS.TID,
                COURSECATALOGS.CCID,
                COURSEMEETINGS.CMID,
                import_courses.primary_instructor,
                import_courses.catalog_id,
                import_courses.crn,
                import_courses.term,
                section,
                credits,
                 title,
                meetings,
                 timecodes,
                cap,
                act,
                 rem
                FROM import_courses
                JOIN TEACHERS ON TEACHERS.primary_instructor = import_courses.primary_instructor
                JOIN COURSECATALOGS ON COURSECATALOGS.catalog_id  = import_courses.catalog_id
                JOIN COURSEMEETINGS ON COURSEMEETINGS.crn = import_courses.crn
                ;


 * sqlite:///CourseData.db
0 rows affected.


[]

In [40]:
%%sql
SELECT * FROM COURSES
limit 5;

 * sqlite:///CourseData.db
Done.


CID,catalog_id,crn,term,section,credits,title,meetings,timecodes,primary_instructor,cap,act,rem,CCID,TID,CMID


In [19]:
%%sql
SELECT TID FROM TEACHERS
limit 5;

 * sqlite:///CourseData.db
Done.


TID
1
2
3
4
5


# 5. Run SELECT Queries

In [20]:
%reload_ext sql
%sql sqlite:///CourseData.db

'Connected: @CourseData.db'

## SELECT Query for all unique classrooms 
- only included those with a character lenght of 7 (some had 2, 8, etc.)
- can't figure out how to get a distinct location when including additinonal fields.

In [21]:
%%sql
SELECT DISTINCT 
location
FROM COURSEMEETINGS
WHERE LENGTH(location) = 7
ORDER BY location
LIMIT 20
;

 * sqlite:///CourseData.db
(sqlite3.OperationalError) no such table: COURSEMEETINGS [SQL: 'SELECT DISTINCT \nlocation\nFROM COURSEMEETINGS\nWHERE LENGTH(location) = 7\nORDER BY location\nLIMIT 20\n;'] (Background on this error at: http://sqlalche.me/e/e3q8)


## SELECT Query displaying all courses in MSBA Program '18-'19
- still have to get rid of blank by making defining description as NOT NULL.
- is there an easier way?

In [22]:
%%sql
SELECT course_title AS Course, program_name AS Program, catalog_id AS Code, description AS Description
FROM COURSECATALOGS
WHERE program_name = 'Information Systems'
AND
Code LIKE "IS 05%"
ORDER BY program_name
LIMIT 50;

 * sqlite:///CourseData.db
Done.


Course,Program,Code,Description
Information Systems and Database Management,Information Systems,IS 0500,"This course introduces the basic concepts and tools relevant to information systems and database management, and their enabling roles in business strategies and operations. Case studies are used to facilitate discussions of practical applications and issues involving strategic alignments of organizations, resource allocation, integration, planning, and analysis of cost, benefit and performance in light of the big data challenges. Specific emphases involve database design and implementation and emerging strategies and technologies such as business intelligence, big data management, web security, and online business analytics."
International Information Systems,Information Systems,IS 0501,"This course examines information technology environments around the world, and attendant challenges to business strategy and information systems design. The course identifies geographic and institutional variables that create borders in the global Internet economy: material infrastructures, socio-economic elements, and political-legal systems. The course emphasizes national and regional strategies, emergent technologies, hybrid systems, and equity issues."
Python for Business Analytics,Information Systems,IS 0505,"In this course, we introduce Python as a language and tool for collecting, preprocessing, and visualizing data for business analytics. since Python is one of the most popular programming languages, along with R, in data mining and business analytics, its fundamental programming logic and knowledge is essential for students to apply in data mining and to succeed in the job market. Specifically, this course focuses on the data-engineering phase, which includes collecting, preprocessing, and visualizing data, with respect to applications in business modeling, optimization, and statistical analysis. In addition, a number of mini projects will be used as vehicles to cover the main applications of data analytics, including recommender systems, text analytics, and web analytics."
Databases for Business Analytics,Information Systems,IS 0510,"This course introduces databases and data management in three parts. The first part covers basic database fundamentals. The second part is a hands-on introduction to Structured Query Language (SQL) for defining, manipulating, accessing, and managing data, accompanied by the basics of data modeling and normalization needed to ensure data integrity. The course concludes with a comprehensive database project that gives each student the opportunity to integrate and apply the new knowledge and skills learned from this class. Advanced topics such as distributed database systems, data services, and NoSQL databases are also discussed."
Project Management,Information Systems,IS 0520,"This course explores the process and practice of project management. Topics to be covered include project lifecycle and organizations, teambuilding and productivity, task scheduling and resource allocation, and progress tracking and control. Cases will be used to consider the implications for change management, consulting, IT implementation, and other related disciplines. Small team projects and experiential exercises will also be used to provide an active learning environment. This course is designed to count toward professional project management certification."
Data Mining and Business Intelligence,Information Systems,IS 0540,"This course will change the way you think about data and its role in business. Businesses, governments, and individuals create massive collections of data as a byproduct of their activity. Increasingly, managers rely on intelligent technology to systematically analyze data to improve their decision-making. In many cases, automating analytical and decision-making processes is necessary because of the large volume of data and the speed with which new data are generated. In this course, we will examine how data analysis technologies can be used to improve managerial decision making. We will study the fundamental principles and techniques of data mining through real-world examples and cases to place data mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application of these techniques is as much an art as it is a science. In addition, we will work ""hands-on"" with contemporary data mining software."
Business Analytics and Big Data Management,Information Systems,IS 0550,"This course will survey state-of-the-art topics in Big Data, looking at data collection (via smartphones, sensors, the Web), data storage and processing (scalable relational databases, Hadoop, Spark, etc.), extracting structured data from unstructured databases, systems issues (exploiting multicore, security), analytics (machine learning, data compression, efficient algorithms), data visualization, and a range of applications. Each of these five modules will introduce broad concepts as well as provide the most recent developments in the area."
Contemporary Topics in Information Systems and Operations Management,Information Systems,IS 0585,"This course draws from current literature and practice on information systems and/or operations management. The topics change from semester to semester, depending on student and faculty interest and may include: project management, e-business, management science with spreadsheets, e-procurement, executive information systems, ethics, and other socio-economic factors in the use of information technology."
Contemporary Topics: Advanced Data Mining Applications,Information Systems,IS 0585B,
Independent Study in Information Systems and Operations Management,Information Systems,IS 0598,This course provides an opportunity for students to complete a project or perform research under the direction of an Information Systems and Operations Management (ISOM) faculty member who has expertise in the topic being investigated. Students are expected to complete a significant project or research paper as the primary requirement of this course. Enrollment by permission of the ISOM Department Chair only.


# JOIN Testing

In [23]:
%%sql
SELECT *
FROM COURSES
LIMIT 10;

 * sqlite:///CourseData.db
Done.


CID,catalog_id,crn,term,section,credits,title,meetings,timecodes,primary_instructor,cap,act,rem,CCID,TID,CMID


In [24]:
%%sql
SELECT t.primary_instructor, c.catalog_id
FROM COURSES as c
JOIN TEACHERS as t ON c.TID = t.TI;

 * sqlite:///CourseData.db
(sqlite3.OperationalError) no such column: t.TI [SQL: 'SELECT t.primary_instructor, c.catalog_id\nFROM COURSES as c\nJOIN TEACHERS as t ON c.TID = t.TI;'] (Background on this error at: http://sqlalche.me/e/e3q8)


In [25]:
%%sql
SELECT TID, primary_instructor FROM TEACHERS
limit 10;


 * sqlite:///CourseData.db
Done.


TID,primary_instructor
1,Michael P. Coyne
2,Rebecca I. Bloch
3,Paul Caster
4,Jo Ann Drusbosky
5,Arleen N. Kardos
6,Scott M Brenner
7,Kevin C. Cassidy
8,Bruce Bradford
9,Milo W. Peck
10,Stephen E. Yost


Have to finish insterting COURSES column

In [43]:
%%sql
VACUUM;

 * sqlite:///CourseData.db
Done.


[]