<h1>Week 8 Homework</h1>
<h3>Exercise 9.1</h3>
As sample size increases, the power of a hypothesis test increases, which means it is more likely to be positive if the effect is real.
Conversely, as sample size decreases, the test is less likely to be positive even if the effect is real.
<br/>To investigate this behavior, run the tests in this chapter with different subsets of the NSFG data. You can use thinkstats2.SampleRows to select a random subset of the rows in a DataFrame.
<br/>What happens to the p-values of these tests as sample size decreases? What is the smallest sample size that yields a positive test?
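
The relationship between sample size and power can be illustrated on synthetic data before touching the NSFG data. The sketch below is not part of the exercise: it assumes two normal populations whose means differ by a hypothetical `effect` of 0.5 standard deviations, a 5% threshold, and helper names (`perm_pvalue`, `estimate_power`) invented for illustration. It repeats a permutation test many times per sample size and reports the fraction of positive results.

```python
import numpy as np

def perm_pvalue(a, b, iters, rng):
    """Two-sided permutation p-value for a difference in means."""
    observed = abs(a.mean() - b.mean())
    pool = np.concatenate([a, b])
    n = len(a)
    count = 0
    for _ in range(iters):
        rng.shuffle(pool)  # shuffle in place, then split into two groups
        stat = abs(pool[:n].mean() - pool[n:].mean())
        if stat >= observed:
            count += 1
    return count / iters

def estimate_power(n, effect=0.5, trials=100, iters=200, alpha=0.05, seed=0):
    """Fraction of trials with p < alpha when the effect is real.

    effect, trials, iters, alpha and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    positives = 0
    for _ in range(trials):
        a = rng.normal(0.0, 1.0, n)     # group with mean 0
        b = rng.normal(effect, 1.0, n)  # group with mean shifted by `effect`
        if perm_pvalue(a, b, iters, rng) < alpha:
            positives += 1
    return positives / trials

for n in (10, 40, 160):
    print(n, estimate_power(n))
```

Because each sample size is tried many times, the result is an estimate of power rather than a single p-value, which makes the trend easier to see than in a single run per size.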


In [1]:
import nsfg
import random
import numpy as np
import thinkstats2

random.seed(18)
np.random.seed(18)

In [2]:
class DiffMeansPermute(thinkstats2.HypothesisTest):
    """Tests a difference in means by permutation.
    Extends HypothesisTest, whose constructor already computes the
    observed test statistic; PValue compares that value with the
    statistics simulated under the null hypothesis.
    """

    def TestStatistic(self, data):
        """Computes the test statistic.

        data: pair of sequences, one per group
        """
        group1, group2 = data
        test_stat = abs(group1.mean() - group2.mean())
        return test_stat

    def MakeModel(self):
        """Build a model of the null hypothesis.
        """
        group1, group2 = self.data
        self.n, self.m = len(group1), len(group2)
        self.pool = np.hstack((group1, group2))

    def RunModel(self):
        """Run the model of the null hypothesis.

        returns: simulated data
        """
        np.random.shuffle(self.pool)
        data = self.pool[:self.n], self.pool[self.n:]
        return data

In [3]:
class CorrelationPermute(thinkstats2.HypothesisTest):
    """Tests correlations by permutation."""

    def TestStatistic(self, data):
        """Computes the test statistic.

        data: tuple of xs and ys
        """
        xs, ys = data
        test_stat = abs(thinkstats2.Corr(xs, ys))
        return test_stat

    def RunModel(self):
        """Run the model of the null hypothesis.

        returns: simulated data
        """
        xs, ys = self.data
        xs = np.random.permutation(xs)
        return xs, ys

In [4]:
class PregLengthTest(thinkstats2.HypothesisTest):
    """Tests difference in pregnancy length using a chi-squared statistic."""

    def TestStatistic(self, data):
        """Computes the test statistic.

        data: pair of lists of pregnancy lengths
        """
        firsts, others = data
        stat = self.ChiSquared(firsts) + self.ChiSquared(others)
        return stat

    def ChiSquared(self, lengths):
        """Computes the chi-squared statistic.
        
        lengths: sequence of lengths

        returns: float
        """
        hist = thinkstats2.Hist(lengths)
        observed = np.array(hist.Freqs(self.values))
        expected = self.expected_probs * len(lengths)
        stat = sum((observed - expected)**2 / expected)
        return stat

    def MakeModel(self):
        """Build a model of the null hypothesis.
        """
        firsts, others = self.data
        self.n = len(firsts)
        self.pool = np.hstack((firsts, others))

        pmf = thinkstats2.Pmf(self.pool)
        self.values = range(35, 44)
        self.expected_probs = np.array(pmf.Probs(self.values))

    def RunModel(self):
        """Run the model of the null hypothesis.

        returns: simulated data
        """
        np.random.shuffle(self.pool)
        data = self.pool[:self.n], self.pool[self.n:]
        return data

In [5]:
def RunTests(live, iters=1000):
    """Runs the tests from Chapter 9 with a subset of the data.

    live: DataFrame
    iters: how many iterations to run
    """

    firsts = live[live.birthord == 1]
    others = live[live.birthord != 1]

    n = len(live)

    # compare pregnancy length for first babies and others
    data = firsts.prglngth.values, others.prglngth.values
    ht = DiffMeansPermute(data)
    p1 = ht.PValue(iters=iters)

    # compare birth weight for first babies and others
    data = (firsts.totalwgt_lb.dropna().values, others.totalwgt_lb.dropna().values)
    ht = DiffMeansPermute(data)
    p2 = ht.PValue(iters=iters)

    # test correlation between mother's age and birth weight
    live2 = live.dropna(subset=['agepreg', 'totalwgt_lb'])
    data = live2.agepreg.values, live2.totalwgt_lb.values
    ht = CorrelationPermute(data)
    p3 = ht.PValue(iters=iters)

    # compare pregnancy lengths by using chi-squared
    data = firsts.prglngth.values, others.prglngth.values
    ht = PregLengthTest(data)
    p4 = ht.PValue(iters=iters)

    print('%d\t\t%0.2f\t\t%0.2f\t\t%0.2f\t\t%0.2f' % (n, p1, p2, p3, p4))

In [6]:

preg = nsfg.ReadFemPreg()
live = preg[preg.outcome == 1]

# start from the full sample and halve the sample size on each pass
n = len(live)
print('%s\t%s\t%s\t%s\t%s' % ("Sample Size", "PVal Preg lnth", "PVal Wght", "PVal corr", "PVal Chi"))
for _ in range(10):
    sample = thinkstats2.SampleRows(live, n)
    RunTests(sample)
    n //= 2

Sample Size	PVal Preg lnth	PVal Wght	PVal corr	PVal Chi
9148		0.16		0.00		0.00		0.00
4574		0.10		0.01		0.00		0.00
2287		0.25		0.06		0.00		0.00
1143		0.24		0.03		0.39		0.03
571		0.81		0.00		0.04		0.04
285		0.57		0.41		0.48		0.83
142		0.45		0.08		0.60		0.04
71		1.00		0.81		0.38		0.69
35		0.41		0.14		0.11		0.00
17		0.13		0.23		0.93		0.00


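<h3>Conclusion</h3>
As expected, tests that are positive with the full sample tend to become
negative as the sample size shrinks. Taking a p-value below 0.05 as positive,
the birth weight, correlation, and chi-squared tests are all positive at
n = 9148 and n = 4574, and all negative at n = 285 and n = 71. The difference
in mean pregnancy length is never positive, even with the full sample
(p = 0.16). However, the pattern is not consistent: each row is a single random
sample, so the p-values fluctuate from one sample size to the next, and some
tests are positive even at small sample sizes (for example, chi-squared at
n = 35 and n = 17).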
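
One possible explanation for the chi-squared p-values of 0.00 at n = 35 and n = 17 is numerical rather than statistical. If a pregnancy length in the range 35 to 43 never occurs in a small pooled sample, its expected count is zero and the term `(observed - expected)**2 / expected` becomes 0/0. The sketch below uses hypothetical counts and assumes the p-value is the fraction of simulated statistics that are at least as large as the observed one; `chi_squared` and `chi_squared_safe` are illustrative names, not part of thinkstats2.

```python
import numpy as np

def chi_squared(observed, expected):
    """Same form as the notebook's statistic: sum((O - E)^2 / E)."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    # silence the divide-by-zero warnings so the nan result is easy to see
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.sum((observed - expected) ** 2 / expected)

def chi_squared_safe(observed, expected):
    """Variant that skips categories whose expected count is zero."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    mask = expected > 0
    return np.sum((observed[mask] - expected[mask]) ** 2 / expected[mask])

# Hypothetical counts: the last category never occurs in the pooled sample,
# so both its observed and expected counts are zero.
observed = [3, 9, 5, 0]
expected = [4.0, 8.0, 5.0, 0.0]

actual = chi_squared(observed, expected)
print(actual)  # nan, because 0 / 0 is undefined

# Hypothetical simulated statistics from the null model.
simulated = np.array([0.5, 1.2, 3.4, 2.2])
p_value = np.mean(simulated >= actual)
print(p_value)  # 0.0, because every comparison with nan is False

print(chi_squared_safe(observed, expected))  # 0.375
```

If this is what happens at the smallest sample sizes, those two results would be artifacts rather than genuine positives; checking whether the observed statistic is `nan` would confirm or rule this out.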