In [46]:
%pylab inline
import pandas

Populating the interactive namespace from numpy and matplotlib


In [53]:
data = pandas.read_stata("ZA1715_v1-0-1.dta")

    

```
The variables we want to compare are:
v563 Common Defense
v566 Common Foreign Policy 
v565 Single Currency 
v567 European Government

To do this we need the European weighting. From Footnote 12:
In calculating the correlations, national weights were applied to all observations so as to provide
a representative sample of the EU population. In addition, an identical analysis was conducted that
excluded all responses of "don't know". The results were very similar to those presented here. In
interpreting the correlations, remember that discrete variables allow only a crude representation of the
actual continuum of responses to each question. This tends to attenuate the magnitude of the
correlations among the variables (Kim and Mueller 1978, 74).

From the codebook:
5  NATION I (UK As one variable.)            
6  NATION WEIGHT I                                         
7  NATION II (NI and GB separated.)                          
8  NATION WEIGHT II                                        
9  EUROPEAN WEIGHT

(We likely only need 9, but it's easier to leave the rest in our set for now.)
```
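Footnote 12 says the correlations were computed with weights applied to the observations. A minimal sketch of a weighted Pearson correlation is shown below; it is one common way to do this, not necessarily the exact procedure used in the paper. The function name `weighted_corr` and the toy arrays are hypothetical, and which weight column to pass (for example `European_Weight`) is an assumption left to the reader.

```python
import numpy as np

def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with observation weights w (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Weighted means of each variable.
    mx = np.average(x, weights=w)
    my = np.average(y, weights=w)
    # Weighted covariance and variances around those means.
    cov_xy = np.average((x - mx) * (y - my), weights=w)
    var_x = np.average((x - mx) ** 2, weights=w)
    var_y = np.average((y - my) ** 2, weights=w)
    return cov_xy / np.sqrt(var_x * var_y)

# Toy data only (not from the survey): responses coded 0, 0.5, 1.
x = [0.0, 0.5, 1.0, 1.0]
y = [0.0, 0.5, 1.0, 0.0]
w = [1.0, 1.0, 1.0, 1.0]
print(weighted_corr(x, y, w))
```

With equal weights this reduces to the ordinary Pearson correlation, which gives a quick sanity check against `np.corrcoef`.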

In [68]:
# Copy the columns of interest into a new data frame and give them saner names.
# .copy() avoids pandas' SettingWithCopyWarning when columns are reassigned later.
Table1_Data = data[['v5','v6','v7','v8','v9','v563','v565','v566','v567']].copy()
Table1_Data.columns = [
    'Nation_1',
    'Weight_1',
    'Nation_2',
    'Weight_2',
    'European_Weight',
    'Common_Defense',
    'Single_Currency',
    'Common_Foreign_Policy',
    'European_Government']

Table1_Data.columns

Index([u'Nation_1', u'Weight_1', u'Nation_2', u'Weight_2', u'European_Weight',
       u'Common_Defense', u'Single_Currency', u'Common_Foreign_Policy',
       u'European_Government'],
      dtype='object')

```
From Footnote 11, page 341:
'Each of the four questions asked the respondent if she were for or against implementing the
particular proposal between the twelve countries of the EC by 1992. I coded a response of "against" as
(0), "don't know" as (0.5), and "for" as (1).'

We need to recode the data as:

"against" (0)
"don't know" (0.5)
"for" (1)

```
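The recoding from Footnote 11 could be sketched as a single mapping step. This is only an illustration under stated assumptions: the helper name `recode_for_against` is hypothetical, the labels `FOR` and `AGAINST` are taken from how the column prints in this notebook, and treating every missing value as "don't know" is an assumption that should be checked against the codebook (missing values may also cover respondents who were never asked).

```python
import numpy as np
import pandas as pd

def recode_for_against(series):
    """Map 'FOR' -> 1.0 and 'AGAINST' -> 0.0, then score missing values as 0.5 (hypothetical helper)."""
    # Cast to object first so categorical columns (as read_stata may return) map cleanly.
    mapped = series.astype(object).map({"FOR": 1.0, "AGAINST": 0.0})
    # Assumption: NaN means "don't know". Any label not in the mapping also ends up as 0.5.
    return mapped.astype(float).fillna(0.5)

# Toy column only (not the survey data).
toy = pd.Series(["FOR", "AGAINST", np.nan, "FOR"], dtype="category")
recoded = recode_for_against(toy)
print(recoded.tolist())
```

Applying the helper column by column keeps the weight and nation columns untouched, which sidesteps the concern about missing values elsewhere in the data.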

In [74]:
# One column at a time, following the pattern:
# data.events = data.events.fillna("")
# (apply could do this in one pass, but it is not clear that there are no NaNs elsewhere in the data.)

# Score "don't know" = 0.5
#Table1_Data.Common_Defense = Table1_Data.Common_Defense.fillna("0.5")
#Table1_Data.Single_Currency = Table1_Data.Single_Currency.fillna("0.5")
#Table1_Data.Common_Foreign_Policy = Table1_Data.Common_Foreign_Policy.fillna("0.5")
#Table1_Data.European_Government = Table1_Data.European_Government.fillna("0.5")

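# Note: describe is referenced without parentheses here, so the output is the bound
# method's repr (which embeds the Series itself) rather than summary statistics.
Table1_Data.Common_Defense.describe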


<bound method Series.describe of 0            FOR
1        AGAINST
2            NaN
3            FOR
4            FOR
5            FOR
6            FOR
7            FOR
8            FOR
9        AGAINST
10           FOR
11       AGAINST
12           FOR
13           FOR
14           FOR
15           FOR
16           FOR
17           FOR
18           FOR
19       AGAINST
20           FOR
21           NaN
22       AGAINST
23           FOR
24           FOR
25           NaN
26           FOR
27           NaN
28           NaN
29           FOR
          ...   
11764        NaN
11765        FOR
11766        NaN
11767        FOR
11768        NaN
11769        NaN
11770        NaN
11771        NaN
11772        NaN
11773        NaN
11774        NaN
11775        FOR
11776        NaN
11777        FOR
11778        NaN
11779        NaN
11780        FOR
11781        NaN
11782        FOR
11783        NaN
11784        NaN
11785        FOR
11786        NaN
11787        FOR
11788        FOR
11789        FO