# From Social Science to what?  
Thank you for checking out the code for: 

> Hogan, Bernie (2022, forthcoming) _From Social Science to Data Science_. Sage Publications. 

This notebook contains the code from the book, along with the section headers and some additional author notes (not in the book) to help you navigate the code. You can run this notebook in a browser by clicking the buttons below.
    
The version uploaded to GitHub should have all the results pasted, but the best way to follow along is to clear all outputs and then start afresh. To do this in Jupyter, go to the menu and select "Kernel -> Restart Kernel and Clear all Outputs...". To do this on Google Colab, go to the menu and select "Edit -> Clear all outputs".
    
The most up-to-date version of this code can be found at https://www.github.com/berniehogan/fsstds 

Additional resources and teaching materials can be found on Sage's forthcoming website for this book. 

All code for the book and derivative code on the book's repository is released open source under the MIT license.
    

[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/berniehogan/fsstds/main?filepath=chapters%2FCh.01.Introduction.ipynb)[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/berniehogan/fsstds/blob/main/chapters/Ch.01.Introduction.ipynb)

# (PO)DIKW - A potential theoretical framework for Data Science 

## What is data? 

## From Data to Wisdom. 

# Beyond the interface 

# Fixed, variable, and marginal costs: Why not to build a barn.

## From Economics to Data Science 

In [1]:
# Attempt 1. High marginal costs, low fixed costs 
email1 = "user.example@mail.com"
email_parts = email1.split("@")
name1 = email_parts[0]

email2 = "generic.student@oii.ox.ac.uk"
email_parts = email2.split("@")
name2 = email_parts[0]

print(name1,name2)

user.example generic.student


In [2]:
# Attempt 2. Low marginal costs
email_list = ["user.example@mail.com",
              "generic.student@oii.ox.ac.uk",
              "dr.professor@oii.ox.ac.uk"]

print([x.split("@")[0] for x in email_list])

['user.example', 'generic.student', 'dr.professor']
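
A possible next step, not taken from the book, is to give the split a name so it can be reused anywhere. The function name `get_username` is hypothetical. This is only a sketch of paying a slightly higher fixed cost once so that each additional email costs almost nothing.

```python
def get_username(email):
    """Return the part of an email address before the first '@'."""
    return email.split("@")[0]

email_list = ["user.example@mail.com",
              "generic.student@oii.ox.ac.uk",
              "dr.professor@oii.ox.ac.uk"]

# Same result as the list comprehension above, but the logic now has a name.
usernames = [get_username(x) for x in email_list]
print(usernames)
```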


## The challenges of maximising fixed costs 

# Code should be FREE

## Functioning code

In [3]:
def square(number):
    squarednumber = number * number  
    return squarednumber

print(square(3))

9


## Robust code

In [4]:
import numbers 

In [5]:
def square(number):
    # pre-emptively checking for inclusion
    if isinstance(number, numbers.Number):
        squarednumber = number * number  
        return squarednumber
    else:
        return float("NaN")

print(square("b"),square(2))

nan 4


In [6]:
def square(number):
    # duck typing to handle exclusion
    try:
        squarednumber = number * number  
        return squarednumber
    except TypeError:
        return float("NaN")

print(square("b"),square(2))

nan 4
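
One thing to keep in mind with either version: NaN never compares equal to anything, including itself, so a caller cannot test the returned value with `==`. Below is a minimal sketch of checking the result with `math.isnan` from the standard library. The `square` function repeats the `isinstance` version so the block runs on its own.

```python
import math
import numbers

def square(number):
    # Same pre-emptive check as the isinstance version above.
    if isinstance(number, numbers.Number):
        squarednumber = number * number
        return squarednumber
    else:
        return float("NaN")

result = square("b")

print(result == float("NaN"))  # False: NaN is not equal to NaN
print(math.isnan(result))      # True: the reliable way to detect NaN
```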


## Elegant

In [7]:
def square(number):
    if isinstance(number, numbers.Number):
        return number * number
    return float("NaN")

print(square("b"),
      square(2))

nan 4


In [8]:
def to_exponent(number, power=2):
    if isinstance(number, numbers.Number):
        return number ** power
    return float("NaN")

print(to_exponent("b"),
      to_exponent(2),
      to_exponent(3,3))

nan 4 27
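
The function above checks `number` but not `power`, so a call such as `to_exponent(2, "b")` would still raise a `TypeError`. As a sketch only, the variant below applies the same check to both arguments. The name `to_exponent_checked` is hypothetical and not from the book.

```python
import math
import numbers

def to_exponent_checked(number, power=2):
    # Require both arguments to be numeric before exponentiating.
    if isinstance(number, numbers.Number) and isinstance(power, numbers.Number):
        return number ** power
    return float("NaN")

print(to_exponent_checked(2),
      to_exponent_checked(3, power=3),  # the default can be overridden by keyword
      to_exponent_checked(2, "b"))
```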


## Efficient 

In [9]:
%%time

newlist = []

for i in range(1000): newlist.append(i)

CPU times: user 233 µs, sys: 12 µs, total: 245 µs
Wall time: 254 µs


In [10]:
newlist = [] 

%timeit -n 1000 for i in range(500): newlist.append(i)

%timeit -n 1000 newlist = [i for i in range(500)]

%timeit -n 1000 newlist = list(range(500))

31.2 µs ± 4.21 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
14.7 µs ± 159 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
5.6 µs ± 73.2 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
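
`%%time` and `%timeit` are IPython magics, so they only work inside Jupyter or Colab. Outside a notebook, a comparable measurement can be sketched with the standard library's `timeit` module. The helper function names below are hypothetical, and timings will vary by machine, so no expected output is shown.

```python
import timeit

def with_append(n=500):
    # Build the list one element at a time.
    newlist = []
    for i in range(n):
        newlist.append(i)
    return newlist

def with_comprehension(n=500):
    # Build the list with a list comprehension.
    return [i for i in range(n)]

def with_list(n=500):
    # Let list() consume the range directly.
    return list(range(n))

for func in (with_append, with_comprehension, with_list):
    # timeit.timeit returns the total seconds for `number` calls.
    seconds = timeit.timeit(func, number=1000)
    print(f"{func.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```

Each helper builds its list from scratch on every call, whereas the first `%timeit` line above keeps appending to the same `newlist` across runs.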


# Pseudocode (and pseudo-pseudocode)

## Attempt 1. Pseudocode as written word

## Attempt 2. Pseudocode as mathematical formula

## Attempt 3. Pseudocode as written code 

## Attempt 4. Slightly more formal pseudocode (in a Python style)

# Summary

# Further reading 

# Extensions and reflections 