# Final Project Template
**Notebook Preparation for Lesson**

Once you open the notebook:

1. Save it to your Google Drive (Copy to Drive) and name it DMAP FA21 Final Project
2. Share the notebook and copy the share ID to the NOTEBOOK_ID (and re-save the notebook)
3. This ID should remain the same for all your milestones
4. You will submit a copy of your updated notebook (this file) on Moodle for each milestone


In [None]:
# Keep this code cell here
# Project title will be used by the reviewer
PROJECT_TITLE = "DMAP FA21 Final Project"
NOTEBOOK_ID   = "https://colab.research.google.com/drive/19ZaG8MrPvBccE1PrLZ73gupKDtN4Lved?usp=sharing"
VERSION = "FA21"


---
# Project Introduction

Double click on this cell to see its contents.  We expect you to replace this cell.

<img align="left" src="http://drive.google.com/uc?export=view&id=1nA9491MchEtFcklvtIGqOnipE63C2FGD"/>

• Describe the **context**, situation, problem to be solved/explored

• Whatever you need to get the reader _involved_

• Images can be hosted using google drive (you may need to create a transparent border)

• Even formulas when needed: 
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$$

• Note that markdown is very whitespace sensitive.

• Click on this cell to read comments.

<!-- this is a comment -->
<!-- 

   VIDEO INSTRUCTIONS (and data hosting)

1. upload to google drive, get the share URL
https://drive.google.com/file/d/1yGvY5a0KAqnOKf5kLh5EbbbRY4_LonAX

2. convert to export URL:
http://drive.google.com/uc?export=download&id=1yGvY5a0KAqnOKf5kLh5EbbbRY4_LonAX

3. OR use some other service to host your video:
https://storage.googleapis.com/uicourse/videos/dmap/Exact%20Instructions%20Challenge%20-%20THIS%20is%20why%20my%20kids%20hate%20me.%20%20Josh%20Darnit.mp4

replace the src="YOUR VIDEO URL" in the <source> tag in the next cell below
-->
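Step 2 of the video instructions in the comment above (share URL to export URL) can also be done in code. The sketch below is only an illustration: the helper name `drive_share_to_export_url` is hypothetical, and it assumes the share URL contains the file ID after `/d/`, as in the example URL from the instructions.

```python
import re


def drive_share_to_export_url(share_url: str, export: str = "download") -> str:
    """Turn a Google Drive share URL into a direct export URL (hypothetical helper)."""
    # Assumption: the file ID is the path segment that follows "/d/".
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"no file ID found in {share_url!r}")
    return f"https://drive.google.com/uc?export={export}&id={match.group(1)}"


# Example URL taken from the video instructions.
share_url = "https://drive.google.com/file/d/1yGvY5a0KAqnOKf5kLh5EbbbRY4_LonAX"
print(drive_share_to_export_url(share_url))
```

Passing `export="view"` would produce the form used for the image in this cell.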

In [None]:
%%html
<!-- this should be the ONLY html cell in the notebook: use markdown -->
<div style="font-size:36px; max-width:800px; font-family:Times, serif;">
 Title Your Video (and update the video URL)
<video width="600" controls>
  <source src="https://drive.google.com/uc?export=download&id=1yGvY5a0KAqnOKf5kLh5EbbbRY4_LonAX"
  type="video/mp4">
</video>
</div>
Note: If your video is too large, you can host it on Vimeo, YouTube, etc and paste the URL here

In [None]:
# add your imports here for your entire project
import pandas as pd
import numpy as np

# Project Introduction
A detailed description about the problem/ research question(s) you are addressing, goals/expected outcome of the project, a brief introduction to your data, and the techniques you will be using.

More than in any earlier period, we depend on the internet for everyday life and work. We are surrounded by information, and it is increasingly hard to evaluate the quality and truthfulness of each piece as we take it in. Because how we see the world and other people is heavily influenced by the information we receive, some people are likely to form beliefs that are not supported by sufficient evidence. Conspiracist ideation (belief in conspiracy theories) is a typical example. People who endorse conspiracy theories tend to explain an event or situation by invoking a conspiracy by sinister and powerful groups, often political in motivation, when other explanations are more probable (Brotherton et al., 2013).

A number of studies find that conspiracy beliefs do not result from rational evaluation of the evidence for each specific conspiracist claim; rather, there seem to be stable individual differences in the general tendency to engage with conspiracist explanations for events (e.g., Goertzel, 1994; Swami et al., 2010, 2011, 2013; Wood et al., 2012). Findings suggest relationships between conspiracy beliefs and personality traits and cognitive styles, but these relationships are limited in robustness and strength (see Swami et al., 2010, 2011, 2013; Swami and Furnham, 2012).

If conspiracy beliefs are indeed related to personality traits and/or cognitive styles, it is plausible to predict one from the other. For this project, I will use machine learning to predict conspiracy beliefs from personality traits. I will investigate, separately, how well personality traits and demographic information (e.g., age, gender, education, major) predict conspiracy beliefs.

The dataset was collected in 2016 by [Open Psychometrics](https://openpsychometrics.org/) through an interactive online version of the **Generic Conspiracist Belief Scale (GCBS)** (Brotherton et al., 2013). It contains 2495 responses (rows) and 72 columns: the 15 GCBS items, the time spent answering the questions, 10 items from a brief measure of the Big Five personality domains (Gosling et al., 2003), 16 validity-check items, and several demographic questions. I intend to use both parametric (e.g., logistic regression) and nonparametric (e.g., decision tree) models and compare their predictive performance.
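The planned comparison could look roughly like the sketch below. It is not the project's analysis. It runs on synthetic stand-in data, assumes scikit-learn is available, and assumes the target is a binary "high vs. low belief" label, a choice the text above does not specify. All variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins (NOT the real data): 10 personality items on a 1-7 scale
# and 3 demographic codes.
X_personality = rng.integers(1, 8, size=(n, 10)).astype(float)
X_demographic = rng.integers(1, 5, size=(n, 3)).astype(float)

# Hypothetical binary target, loosely tied to two personality items plus noise.
signal = X_personality[:, 0] - X_personality[:, 4] + rng.normal(0, 2, size=n)
y = (signal > np.median(signal)).astype(int)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}
feature_sets = {"personality": X_personality, "demographic": X_demographic}

# 5-fold cross-validated accuracy for every (feature set, model) pair.
results = {}
for feat_name, X in feature_sets.items():
    for model_name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
        results[(feat_name, model_name)] = scores.mean()
        print(f"{feat_name:12s} {model_name:20s} accuracy = {scores.mean():.3f}")
```

Evaluating each feature set separately matches the stated plan of examining personality traits and demographic information on their own. Cross-validation puts the two model families on the same footing.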

# Data Acquisition, Selection, Cleaning
Introduce the data here as well as any technical overview that's important that wasn't given in the introduction


In [None]:
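# read data from Google Drive
# share link, kept for reference
url = 'https://drive.google.com/file/d/14ppYjXV4N9FNbcmSSLLc1AXq6Tffi7hP/view?usp=sharing'
# file ID taken from the share link
doc_id = '14ppYjXV4N9FNbcmSSLLc1AXq6Tffi7hP'
# direct-download form of the link, which pd.read_csv can fetch
base_url = 'https://drive.google.com/uc?id=' + doc_id
df = pd.read_csv(base_url)
# note: a cell displays only its last expression, so head() is not shown here
# (wrap it in display() to see it)
df.head()
df.describe()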

statistic,Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10,Q11,Q12,Q13,Q14,Q15,E1,E2,E3,E4,E5,E6,E7,E8,E9,E10,E11,E12,E13,E14,E15,introelapse,testelapse,surveyelapse,TIPI1,TIPI2,TIPI3,TIPI4,TIPI5,TIPI6,TIPI7,TIPI8,TIPI9,TIPI10,VCL1,VCL2,VCL3,VCL4,VCL5,VCL6,VCL7,VCL8,VCL9,VCL10,VCL11,VCL12,VCL13,VCL14,VCL15,VCL16,education,urban,gender,engnat,age,hand,religion,orientation,race,voted,married,familysize
count,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0,2495.0
mean,3.472545,2.963527,2.046894,2.636072,3.254108,3.108617,2.666934,2.450501,2.232866,3.502204,3.265331,2.64489,2.103006,2.955912,4.227655,44419.9,51506.46,6663.946693,22867.94,7548.478557,8060.431663,8209.945,4762.673747,6599.963126,9222.47976,8850.744289,11173.0,6547.332665,7660.725451,7665.187976,850.004008,288.237675,298.517034,3.483367,4.373547,4.777555,4.333467,5.63006,5.024048,4.967936,4.030461,4.364329,2.537475,0.972345,0.941884,0.549098,0.975551,0.952705,0.104609,0.272545,0.453307,0.059319,0.967134,0.313427,0.167134,0.788377,0.926253,0.970741,0.982766,2.323046,2.119439,1.557515,1.242084,43.365531,1.182365,4.180361,1.662525,3.844489,1.662926,1.262124,2.560321
std,1.455552,1.494669,1.387236,1.451371,1.471855,1.506676,1.509954,1.569256,1.419266,1.388713,1.400302,1.504787,1.382461,1.489222,1.104538,1354595.0,1614006.0,10890.202483,656245.9,9892.172182,9338.071912,36656.93,7820.863912,11945.74574,18519.041061,15473.033589,100152.0,6090.033961,7594.379339,23886.001809,8176.969673,3452.49299,2964.644784,1.986629,1.863189,1.763883,2.043292,1.414999,1.834639,1.75681,1.919562,1.933313,1.64566,0.164016,0.23401,0.497683,0.154469,0.212311,0.306111,0.445358,0.497915,0.236267,0.178321,0.463979,0.37317,0.408541,0.261412,0.168564,0.13017,0.947208,0.762074,0.596534,0.435854,684.593489,0.496186,3.697397,1.170653,0.924942,0.491924,0.559446,2.355926
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,415.0,0.0,0.0,0.0,0.0,0.0,1.0,3.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,2.0,2.0,1.0,1.0,2.0,2.0,1.0,1.0,1.0,2.0,2.0,1.0,1.0,1.0,4.0,5912.5,5779.5,3524.0,4087.5,4167.5,4497.5,4233.0,2580.5,3473.5,4794.0,4695.5,4835.5,3789.0,4439.5,3782.5,3.0,81.0,105.0,2.0,3.0,3.0,2.0,5.0,4.0,4.0,2.0,3.0,1.0,1.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,2.0,2.0,1.0,1.0,18.0,1.0,1.0,1.0,4.0,1.0,1.0,2.0
50%,4.0,3.0,1.0,2.0,4.0,3.0,2.0,2.0,2.0,4.0,4.0,2.0,1.0,3.0,5.0,8124.0,8161.0,4858.0,5666.0,5783.0,6286.0,5856.0,3529.0,5005.0,6637.0,6370.0,6710.0,5468.0,6096.0,5155.0,8.0,107.0,137.0,3.0,5.0,5.0,5.0,6.0,5.0,5.0,5.0,5.0,2.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,2.0,2.0,2.0,1.0,22.0,1.0,2.0,1.0,4.0,2.0,1.0,2.0
75%,5.0,4.0,3.0,4.0,5.0,4.0,4.0,4.0,3.0,5.0,4.0,4.0,3.0,4.0,5.0,12396.0,11685.5,7020.5,8164.5,8225.5,8974.0,8108.5,5137.5,7219.0,9493.0,8951.0,9621.5,7532.0,8600.0,7373.0,40.0,141.5,188.0,5.0,6.0,6.0,6.0,7.0,7.0,6.0,5.0,6.0,3.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,3.0,3.0,2.0,1.0,35.0,1.0,6.0,2.0,4.0,2.0,1.0,3.0
max,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5.0,67558860.0,77868020.0,281827.0,32021350.0,227606.0,211752.0,1342932.0,257289.0,454068.0,653298.0,412550.0,4946876.0,119049.0,188979.0,836054.0,198370.0,108420.0,102231.0,7.0,7.0,7.0,7.0,7.0,7.0,7.0,7.0,7.0,7.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,4.0,3.0,3.0,2.0,33769.0,3.0,12.0,5.0,5.0,2.0,3.0,98.0
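The summary above contains values that look implausible: most Q and TIPI items have a minimum of 0 although their quartiles and maxima sit on 1-5 and 1-7 scales, `age` reaches 33769, and `familysize` reaches 98. A minimal cleaning sketch follows. It assumes that 0 marks a skipped item and that ages outside 13-100 and family sizes outside 1-20 are entry errors. These thresholds, the function name `clean_responses`, and the toy rows are illustrative assumptions, not part of the dataset's documentation.

```python
import re

import pandas as pd

# Toy rows (NOT the real data) reproducing the anomalies visible in describe().
toy = pd.DataFrame({
    "Q1": [4, 0, 5, 2],
    "Q2": [3, 1, 0, 5],
    "TIPI1": [7, 0, 3, 5],
    "age": [22, 18, 33769, 35],
    "familysize": [2, 98, 3, 1],
})


def clean_responses(df, min_age=13, max_age=100, max_familysize=20):
    """Mark implausible values as missing (thresholds are assumptions)."""
    out = df.copy()
    # Q1..Q15 and TIPI1..TIPI10: treat 0 as "no answer" (assumption).
    item_cols = [c for c in out.columns if re.fullmatch(r"(Q|TIPI)\d+", c)]
    out[item_cols] = out[item_cols].mask(out[item_cols] == 0)
    # Keep only values inside the assumed plausible ranges; others become NaN.
    out["age"] = out["age"].where(out["age"].between(min_age, max_age))
    out["familysize"] = out["familysize"].where(
        out["familysize"].between(1, max_familysize)
    )
    return out


cleaned = clean_responses(toy)
print(cleaned.isna().sum())
```

Marking the values as missing instead of dropping rows keeps the decision about imputation or row removal open for the analysis step.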


# Data Exploration
Initial exploration of the dataset, not required, but useful to give the reader a 'view' of the data
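One possible starting point for the exploration is to collapse the item responses into scale scores. The sketch below uses toy rows and rests on two assumptions to verify against the dataset's codebook: the GCBS score is taken as the mean of Q1-Q15, and TIPI1-TIPI10 follow the standard TIPI item order, in which the even-numbered items are reverse-keyed on the 1-7 scale.

```python
import pandas as pd

# Toy responses (NOT the real data), using the dataset's column names.
toy = pd.DataFrame({
    **{f"Q{i}": [1, 3, 5] for i in range(1, 16)},
    **{f"TIPI{i}": [7, 4, 1] if i % 2 else [1, 4, 7] for i in range(1, 11)},
})

# GCBS score: mean of the 15 items (assumed scoring rule).
gcbs_cols = [f"Q{i}" for i in range(1, 16)]
toy["gcbs_mean"] = toy[gcbs_cols].mean(axis=1)

# TIPI trait -> (regular item, reverse-keyed item); assumed standard keying.
tipi_key = {
    "extraversion": ("TIPI1", "TIPI6"),
    "agreeableness": ("TIPI7", "TIPI2"),
    "conscientiousness": ("TIPI3", "TIPI8"),
    "emotional_stability": ("TIPI9", "TIPI4"),
    "openness": ("TIPI5", "TIPI10"),
}
for trait, (regular, reverse) in tipi_key.items():
    # Reversing a 1-7 response x gives 8 - x.
    toy[trait] = (toy[regular] + (8 - toy[reverse])) / 2

score_cols = ["gcbs_mean"] + list(tipi_key)
print(toy[score_cols])
```

On the real data, `df[score_cols].describe()` and `df[score_cols].corr()` would then give a first view of how the belief score relates to each trait.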

# Data Analysis

# Summary


---
# Submission Guidelines (keep this section here)
---


When you are ready to submit your project, part of the submission process will be to register your notebook for reviewing.  

For each milestone, you will submit an updated version of your project notebook (this notebook) that meets that milestone’s requirements. Your project notebook ID and URL should stay the same across all milestones.

You will also receive the links and instructions to do the peer reviews.

**Submission for Milestone0:**
1. Save a copy of this notebook in your Google drive
2. Share the Notebook and place the Notebook ID in the first code cell
3. Download as "DMAP_FA21_Final_Project_MS0.ipynb"
4. Submit this in Moodle
