# CSCS530 Midterm

#### Ruoyan Sun
#### April 27, 2015



## Smoking Model Outline

### Goal
We will examine individual smokers' decisions to quit smoking and how those decisions influence others through social interactions.

### Justification
Because individual smokers in the model are heterogeneous, we use agent-based modeling (ABM) to represent each individual as an agent and study that agent's decision-making process. In addition, given the social contagion of smoking, we construct social networks in the model to mimic social interactions and use network analysis to investigate how smoking cessation spreads on a network. ABM and social network analysis are therefore both necessary in this study: ODE models can adequately address neither the heterogeneity nor the social network structure of the population.

### Outline
- Implement smoking prevention programs.
- Implement smoking treatment programs.
- Observe the natural progression of the smoking population without interventions, paying attention to whether smokers and non-smokers form clusters.
- Compare the effectiveness of the intervention programs by observing the number of smokers in the network and the corresponding social network structure.
- Observe smoking prevalence as well as the smoking cessation rate at each time step.

We then break the model down into several pieces to better describe it:

### I. Space
In the model, the social network space follows a power-law degree distribution at baseline. At each time period, a randomly selected 10% of the links are then rewired. Each agent is assigned an index so that its friendships can be tracked in the network.
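
As a minimal sketch of the baseline space, the snippet below builds a scale-free graph with networkx. It assumes that a Barabasi-Albert preferential-attachment generator is an acceptable stand-in for the power-law network; the outline does not name a generator, and the function name `build_baseline_network`, the reduced population size, and `m=2` are illustrative choices only.

```python
import networkx

def build_baseline_network(num_people=1000, m=2, seed=0):
    """Build a scale-free graph whose node labels double as agent indices.

    Assumption: Barabasi-Albert preferential attachment stands in for the
    power-law network; m (edges added per new node) is illustrative.
    """
    # Nodes are labeled 0..num_people-1, so the label can serve as the agent index.
    return networkx.barabasi_albert_graph(num_people, m, seed=seed)

graph = build_baseline_network()
# dict(...) keeps this working whether degree() returns a view or a dict.
degrees = sorted(dict(graph.degree()).values())
median_degree = degrees[len(degrees) // 2]
print("nodes: {}".format(graph.number_of_nodes()))
print("median degree: {}, max degree: {}".format(median_degree, degrees[-1]))
```

A heavy-tailed network shows up here as a maximum degree far above the median: a few hub agents have many friends while most have only a handful.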

### II. Actors
#### A. People

In this model, people are individuals who are either smokers or non-smokers. They have the following properties:

#### Properties
1. age: how old is the person? We only look at adults between 20 and 64 years old.
2. gender: what is the gender of the person? Male or female.
3. ethnicity: what is the person's ethnicity? Here we use white, black and other to simplify the model.
4. social_class: what is the social class of the person? We use a binary variable, low vs. high income.
5. education: how much education has the person received? We use 3 categories: below high school, high school graduate, college or higher.
6. is_smoker: is this person a smoker?
7. smoking_cessation: what is the probability that a person will try to quit smoking? The value of this variable is calculated as a function of properties 1 to 5 listed above.
8. success_rate: what is the probability that the person will quit smoking successfully? We set this value to be the same across the population.
9. prob_smoking: what is the probability that the person will start smoking if currently a non-smoker? The value is assigned at random from a uniform distribution.
10. id: what is the id number of the agent? Each agent gets a unique index number from 1 to 10,000.

For their step function, agents perform the following:
1. Evaluate the probability of smoking cessation, then draw a random number between 0 and 1. If the random number is smaller than the probability, the agent attempts to quit smoking; otherwise the agent stays a smoker for this time period.
2. Based on success_rate, randomly pick the proportion of those attempting to quit who succeed. These people stay non-smokers for this time period.
3. Evaluate the number of friends they have at this time period.
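
Steps 1 and 2 of the agent step can be sketched as a two-stage random draw. This is a simplified illustration, not the notebook's implementation: the function name `attempt_and_quit` and the cessation probability of 0.3 are assumptions, and smoking initiation for non-smokers is left out.

```python
import numpy

def attempt_and_quit(is_smoker, smoking_cessation, success_rate, rng):
    """Return the agent's smoking status after one time step.

    Stage 1: a smoker attempts to quit with probability smoking_cessation.
    Stage 2: an attempt succeeds with probability success_rate.
    """
    if not is_smoker:
        return False  # initiation is not modeled in this sketch
    attempts = rng.random_sample() < smoking_cessation
    succeeds = attempts and (rng.random_sample() < success_rate)
    return not succeeds

rng = numpy.random.RandomState(42)
# 0.3 is an illustrative cessation probability; 0.8 is the default success_rate.
statuses = [attempt_and_quit(True, 0.3, 0.8, rng) for _ in range(10000)]
quit_fraction = 1.0 - sum(statuses) / float(len(statuses))
print("fraction who quit: {:.3f}".format(quit_fraction))
```

Under these assumptions the per-step quit probability is the product of the two stages, 0.3 * 0.8 = 0.24, so the printed fraction should land close to that value.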

#### B. Network
A social network exists within the population with the following properties: 

#### Properties
1. size: how big is the network? The number of nodes is the same as the size of the population, which is set to be 10,000. 
2. type: what is the type of the network? Here we set it to be a scale-free network. 

For its step function, the network performs the following:
1. Evaluate the number of edges each node has.
2. Pick 10% of the edges randomly and rewire them. For the rewiring process, half of the edges are rewired randomly while the other half are rewired to connect only agents with the same ethnicity and social class.
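
The rewiring step above can be sketched as follows. This is one possible reading under stated assumptions: the helper names `same_group` and `rewire_step`, the retry cap `max_tries`, and the random attribute assignment in the usage part are all hypothetical, and node attributes are assumed to be stored under the keys `ethnicity` and `social_class`.

```python
import random
import networkx

def same_group(graph, a, b):
    """True when two agents share both ethnicity and social class."""
    na, nb = graph.nodes[a], graph.nodes[b]
    return (na["ethnicity"] == nb["ethnicity"]
            and na["social_class"] == nb["social_class"])

def rewire_step(graph, fraction=0.1, rng=None, max_tries=1000):
    """Rewire a fraction of edges: first half at random, second half within group."""
    rng = rng or random.Random()
    edges = list(graph.edges())
    nodes = list(graph.nodes())
    chosen = rng.sample(edges, int(fraction * len(edges)))
    for k, (u, v) in enumerate(chosen):
        graph.remove_edge(u, v)
        homophilous = k >= len(chosen) // 2
        # Retry until a valid pair is found; the cap guarantees termination.
        for _ in range(max_tries):
            a, b = rng.sample(nodes, 2)
            if graph.has_edge(a, b):
                continue
            if homophilous and not same_group(graph, a, b):
                continue
            graph.add_edge(a, b)
            break
        else:
            graph.add_edge(u, v)  # no valid pair found: restore the old edge
    return graph

rng = random.Random(0)
graph = networkx.barabasi_albert_graph(200, 2, seed=0)
for node in graph.nodes():
    graph.nodes[node]["ethnicity"] = rng.choice([1, 2, 3])
    graph.nodes[node]["social_class"] = rng.choice([True, False])

edges_before = graph.number_of_edges()
old_edges = set(frozenset(e) for e in graph.edges())
rewire_step(graph, fraction=0.1, rng=rng)
new_edges = set(frozenset(e) for e in graph.edges())
print("edges before: {}, after: {}".format(edges_before, graph.number_of_edges()))
print("edges that changed: {}".format(len(new_edges - old_edges)))
```

Because every deleted edge is replaced by exactly one new edge, the total number of edges stays constant across time steps and only the wiring changes.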

### III. Initial Conditions
#### A. People
- People are randomly distributed throughout the network using a uniform distribution with replacement, and their characteristics are likewise assigned at random using a uniform distribution with replacement.
- Each person has their own probability of stopping smoking, which is a function of their characteristics and social network.

#### B. Space
- A power-law network is constructed at the initial time. Then, at every time step, a randomly selected 10% of the edges are rewired: each selected edge is deleted, two nodes (agent indices) are picked, and a new edge is constructed connecting these two nodes.

### IV. Model Parameters
Based on the description above, we will have three different models: the baseline model, prevention model and treatment model. 

### V. Sweep Values
I plan to vary the smoking cessation success rate in the model. I expect all of the outcome vectors to be affected by the sweep, including the number of smokers, the number of non-smokers, and the smoking cessation rate at each time step.
I also want to test how network structure influences smoking, but I am not sure that I will have enough time for this.
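
A sweep over the success rate could be organized roughly as below. This is a deliberately simplified sketch that ignores the network, the interventions, and smoking initiation; the function name `run_once`, the sweep values, the cessation probability of 0.3, and the reduced population size are illustrative assumptions. The 20% initial prevalence follows the default used in the model class.

```python
import numpy

def run_once(success_rate, num_people=1000, num_steps=20,
             smoking_cessation=0.3, seed=0):
    """Run one simplified simulation and return the smoker count per step."""
    rng = numpy.random.RandomState(seed)
    is_smoker = rng.random_sample(num_people) < 0.2  # 20% initial prevalence
    history = []
    for _ in range(num_steps):
        # Stage 1: smokers who attempt to quit this step.
        attempts = is_smoker & (rng.random_sample(num_people) < smoking_cessation)
        # Stage 2: attempts that succeed.
        quits = attempts & (rng.random_sample(num_people) < success_rate)
        is_smoker = is_smoker & ~quits
        history.append(int(is_smoker.sum()))
    return history

sweep_values = [0.2, 0.5, 0.8]
results = {rate: run_once(rate) for rate in sweep_values}
for rate in sweep_values:
    print("success_rate={}: smokers left after 20 steps = {}".format(
        rate, results[rate][-1]))
```

Using the same seed for every sweep value means the runs differ only in the swept parameter, which makes the outcome vectors directly comparable.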

In [1]:
%matplotlib inline

#standard imports
import copy
import itertools

#import scientific tools
import numpy
import matplotlib.pyplot as plt
import networkx
import pandas
import seaborn; seaborn.set()

#import widgets (ipywidgets replaces the deprecated IPython.html.widgets)
from ipywidgets import *



### Person Class
Below, we define our person class. It contains a constructor as well as several methods:
- constructor: class constructor, which initializes/creates the person when we call Person(). This is the `__init__` method.
- decide_smoking: decide if a person will smoke by comparing probabilities. 
- decide_cessation: decide if a person will stop smoking by comparing probabilities. 

In [10]:
class Person(object):
    """
    Person class, contains behavior of a person. 
    """
    def __init__(self, model, person_id, gender, ethnicity, social_class, education, smoking_cessation,
                 is_smoker=False, success_rate=0.8, prob_smoking=0.2, age=None):
        """
        Constructor for the Person class. By default, the person
        is a non-smoker,
        has a smoking cessation success rate of 0.8,
        has a 20% smoking initiation rate (applies to non-smokers only).
        """
        #set model link and ID
        self.model=model
        self.person_id=person_id
        
        #set person parameters
        #age was used here without being a constructor argument; it is now an optional
        #keyword that falls back to decide_age() when omitted
        self.age=age if age is not None else self.decide_age()
        self.gender=gender
        self.ethnicity=ethnicity
        self.social_class=social_class
        self.education=education
        self.is_smoker=is_smoker
        self.smoking_cessation=smoking_cessation
        self.success_rate=success_rate
        self.prob_smoking=prob_smoking
        
        
    def decide_age(self):
        """
        decide age. We only look at ages 20 to 64, divided into 5-year age groups. We take the average
        age of each age group and use 2013 Census data to calculate the weight of each group.
        """
        age = 0
        a = numpy.random.uniform(0,100)
        if a <= 12.2:
            age = 22
        elif (a > 12.2) and (a<=24):
            age = 27
        elif (a > 24) and (a<=35.3):
            age = 32
        elif (a > 35.3) and (a<=46.6):
            age = 37
        elif (a > 46.6) and (a<=58.4):
            age = 42
        elif (a > 58.4) and (a<=71.2):
            age = 47
        elif (a > 71.2) and (a<=83.7):
            age = 52
        elif (a > 83.7) and (a<=94.8):
            age = 57
        else:
            age = 62
        return age

    def decide_gender(self):
        """
        decide gender, if random probability > 0.5, then male(false). otherwise female(true).
        """
        if numpy.random.uniform(0,1)>0.5:
            return False
        else:
            return True
        
    def decide_social_class(self):
        """
        decide social class, if random probability <= 0.5, low social class (false). otherwise high social class(true).  
        """
        if numpy.random.uniform(0,1)>0.5:
            return True
        else:
            return False
        
    def decide_education(self):
        """
        From Census data in 2000, people with less than a high school education make up 19.6% of the adult
        population, those with high school or some college (no degree) make up 56%, and those with a
        bachelor's degree or above make up 24.4%.
        """
        education=0
        edu= numpy.random.uniform(0,1)
        if edu <= 0.196:
            education = 1
        elif (edu > 0.196) and (edu <= 0.756):
            education = 2
        else:
            education = 3
        return education
        
    def decide_race(self):
        """
        randomly assign white, black and other. Based on demographic info in the US in 2000, 
        72.4% white, 12.6% black and 15% other
        """
        race = 0
        r = numpy.random.uniform(0,1)
        if r <= 0.724:
            race = 1
        elif (r > 0.724) and (r <= 0.85):
            race = 2
        else:
            race = 3
        return race
    
    def decide_smoking(self):
        """
        decide if a nonsmoker will become a smoker using the probability of smoking
        """
        if self.is_smoker==False:
            #start smoking with probability prob_smoking
            if numpy.random.random()<self.prob_smoking:
                return True
            else: 
                return False
            
    def decide_cessation(self): 
        """
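        decide if a smoker will successfully quit smoking
        """
        if self.is_smoker==True:
            #attempt to quit with probability smoking_cessation
            if numpy.random.random()<self.smoking_cessation:
                #the attempt succeeds with probability success_rate
                if numpy.random.random()<self.success_rate:
                    return True
                else:
                    return False
        #no quit attempt was made (or the person is not a smoker)
        return False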
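
The `decide_*` methods in the `Person` class draw one category at a time with if/elif chains over cumulative cutoffs. As a hedged alternative, the same weights can be passed directly to `numpy.random.RandomState.choice`, which draws attributes for the whole population at once. The weights below are the per-category shares implied by the docstrings; the constant names and the helper `draw` are hypothetical.

```python
import numpy

# Average age of each 5-year group and its share in percent,
# obtained by differencing the cumulative cutoffs used in decide_age.
AGE_MIDPOINTS = [22, 27, 32, 37, 42, 47, 52, 57, 62]
AGE_WEIGHTS = [12.2, 11.8, 11.3, 11.3, 11.8, 12.8, 12.5, 11.1, 5.2]

EDUCATION_LEVELS = [1, 2, 3]           # below high school, high school, college+
EDUCATION_WEIGHTS = [0.196, 0.56, 0.244]

RACE_CODES = [1, 2, 3]                 # white, black, other
RACE_WEIGHTS = [0.724, 0.126, 0.15]

def draw(values, weights, size, rng):
    """Draw `size` categories with probabilities proportional to `weights`."""
    p = numpy.asarray(weights, dtype=float)
    return rng.choice(values, size=size, p=p / p.sum())

rng = numpy.random.RandomState(0)
num_people = 10000
ages = draw(AGE_MIDPOINTS, AGE_WEIGHTS, num_people, rng)
education = draw(EDUCATION_LEVELS, EDUCATION_WEIGHTS, num_people, rng)
race = draw(RACE_CODES, RACE_WEIGHTS, num_people, rng)

print("share with education level 2: {:.3f}".format(numpy.mean(education == 2)))
print("share with race code 1: {:.3f}".format(numpy.mean(race == 1)))
```

Keeping the weights in one table per attribute makes a mistyped cutoff easier to spot than in a chain of comparisons, and the normalization inside `draw` means the weights may be given as percentages or fractions.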
    

In [None]:
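class Model(object):
    
    def __init__(self, num_people):
        
        self.num_people=num_people
        #the list must exist before setup_people() fills it
        self.people=[]
        self.setup_people()
        
    def setup_people(self):
        #draft placeholder: the Person arguments are not specified yet, so only the index is stored
        for i in range(self.num_people):
            self.people.append(i)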

In [12]:
individual=Person(1,22,False,2,False,1,True,False,0.8,0.2)
print(individual)

<__main__.Person object at 0x104897d10>


### Network Class
Below we have our network class.

In [4]:
class Network(object):
    """
    Construct the network
    """
    
    def __init__(self, network_node, network_edge):
        self.network_node=network_node
        self.network_edge=network_edge

### Model Class
We define our baseline model class below. This has several parts:
- constructor: class constructor. Initializes/creates the model we call model()
- setup_network: method to setup the network space
- setup_people: method to create people
- step: main step method to control each time simulation

In [5]:
class Model(object):
    """
    Model class, which encapsulate the entire behavior of a single run.
    """
    def __init__(self, num_people=10000):
        """
        class constructor
        """
        
        #set our model parameters: 
        self.num_people=num_people
        
        #set state variables
        self.t=0
        self.people = []
        #initial split: 20% smokers, 80% non-smokers
        self.num_smoker = int(0.2*self.num_people)
        self.num_nonsmoker = self.num_people-self.num_smoker
       
        
        #set history variables
        self.history_network = []
        self.history_num_smoker=[]
        self.history_num_nonsmoker=[]
        
        #call setup methods to initialize people and network
        self.setup_people()
        self.setup_network()
    
    def setup_people(self):
        """
        method to set up people
        """
        pass 
    
    def setup_network(self):
        """
        method to set up network
        """
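        #NOTE: random_lobster does not produce a power-law degree distribution; a scale-free
        #generator would be needed to match the network described in the outline
        self.network = networkx.random_lobster(self.num_people,0.9,0.9)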
    
    def prevention(self):
        """
        implement a prevention program
        """
        pass
    
    def treatment(self):
        """
        implement a treatment program
        """
        pass

### Overview of Results & Hypotheses

The result I expect to see from the model is how smoking spreads over the network. The measurement will be the number of people who are smokers and non-smokers at each time step. We can calculate the proportions of these two groups and use the agents' ids to list who belongs to each. We also implement two different intervention programs to see whether they improve the results.
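
One way to quantify these measurements, including whether smokers cluster together, is sketched below. It assumes smoking status is stored as a node attribute named `is_smoker` and uses networkx's attribute assortativity coefficient as the clustering measure; that choice of measure, the small graph, and the random status assignment are illustrative assumptions, not results.

```python
import random
import networkx

# Illustrative setup: a small scale-free graph with randomly assigned status.
graph = networkx.barabasi_albert_graph(500, 2, seed=0)
rng = random.Random(0)
for node in graph.nodes():
    graph.nodes[node]["is_smoker"] = rng.random() < 0.2

status = networkx.get_node_attributes(graph, "is_smoker")
prevalence = sum(status.values()) / float(graph.number_of_nodes())
smoker_ids = sorted(node for node, smokes in status.items() if smokes)

# Positive values mean smokers tend to be linked to other smokers;
# values near zero mean status is unrelated to who is connected to whom.
assortativity = networkx.attribute_assortativity_coefficient(graph, "is_smoker")

print("smoking prevalence: {:.3f}".format(prevalence))
print("number of smokers: {}".format(len(smoker_ids)))
print("smoker assortativity: {:.3f}".format(assortativity))
```

Recording these three quantities at every time step would give the prevalence curve, the list of smokers by id, and a single number summarizing cluster formation for each model variant.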

My hypotheses are that social network structure and the smoking cessation success rate play important roles in how smoking spreads. By changing the values of these two factors, I expect to see a wide range of results. For example, the effects of the two interventions might be closely related to the value of the smoking cessation success rate.
