A study of the Legco exit-poll and rolling survey results published on the hkupop web site

kwcityhk/hkupop-legco


Using hkupop-legco spss datasets

  • The io (GitHub Pages) link is: https://kwcityhk.github.io/hkupop-legco/

  • The io page can be edited via Settings (auto page) or at: https://github.com/kwcityhk/hkupop-legco/generated_pages/new

  • The wiki is here: https://github.com/kwcityhk/hkupop-legco/wiki

  • Just want to use it?

    • If you are using a PC and have PSPP installed:
    • Ensure that PSPP is on your system path
    • Just copy hkupop to D:\hkupop-legco\hkupop (downloading from GitHub to D:\ would do it)
    • The batch files under D:\hkupop-legco\hkupop\ will generate all files and some statistics
    • There are enough samples for you to modify and test
    • However, due to issues with addfiles, the all-dates (i.e. not by 5 days) analysis has ???? in its value labels
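
For the "PSPP is on your system path" step above, here is a minimal sketch for a Unix-like shell (macOS, Ubuntu, or something like Git Bash on a PC). The helper name on_path is my own and not part of PSPP; on a plain Windows command prompt you would need a different check.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the given command can be found on PATH.
# "on_path" is a hypothetical helper name used only for this sketch.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path pspp; then
    echo "pspp found at: $(command -v pspp)"
else
    echo "pspp is not on PATH; add its install directory to PATH first" >&2
fi
```

If the second message appears, the batch files are unlikely to work until the PSPP install directory has been added to the path.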
  • Quick sampling

  • How to really analyse it:

    • (To be updated)

    • Start to recode it first (see the recode sub-directory under pspp-run)

    • Understand the variables and the questionnaire (some of my attempts to understand them are in the .sps files under the recode directory)

    • Deal with frequencies

    • Crosstab, graph, plot, ...

    • Weight, non-response rate, ... issues (see issues below)???

    • Then ... (loop, data -> program -> statistics -> check -> analysis -> political analysis -> loop back)

  • Some issues involved

    • General issues: for example, http://andrewgelman.com/2016/06/24/brexit-polling-what-went-wrong/ is a good essay which you should read. In summary: 1. survey not a representative sample; 2. survey responses are not voting intentions; 3. "Shift in attitudes during the last day"; 4. "Unpredicted patterns of voter turnout", especially the pattern; 5. sampling variability.
    • Having said that, the prediction was not that bad; read the article
    • It is not difficult but also not easy

  • Purpose

    A personal study of the Legco exit-poll and rolling survey results published on the hkupop web site.

    I personally do not recommend using csv. One of the major obstacles to using other people's surveys is the metadata, e.g. what counts as missing data, what the questions are, what "1" means (M or F), etc. Take AGE1 and AGE2 as an example: you have to combine the two together to form a full age group. I know he said they provide that format because some kids may want it. Maybe, but having done this kind of job for academics since 1980, sorry, I do not agree. Hence, I did not even bother to find out where the .csv files are.

    I understand SPSS is good because it is easy for non-programmers. It is used here mainly because PSPP is available.

    Still, for visualisation and a better programming language, I would suggest something like R, or SAS if you have that expensive package (do they have a free version now? Not sure). Having said that, given that the real SPSS now supports Python and other languages, maybe it is OK. Not sure about PSPP though. SPSS is so COBOL-like it is ... well, better the tool you know than the tool you don't, I guess. (Actually, after trying a few simple things, it turns out the easy ones are easier in PSPP than in R, but the harder stuff is easier in SAS (at least in the 1980s) and R.)

    BTW, R cannot read the .SAV file for some reason. To be investigated. I have to use the csvy file.

    Well, maybe it is just my rusty SPSS. Also, unlike 40 years ago (when I was paid to do this) or 20 years ago (when I was trying to help others), let us see how much incentive there is for working with this old software.

  • Source

    The data come from : https://www.hkupop.hku.hk/english/resources/dataset/lc/index.html

    In particular, I added some checksums (sha2, generated under Mac OS X); see the .md file there for details

  • Disclaimer and License

    This work has nothing to do with my work and the organisation I work in. Absolutely nothing.

    I have no relationship with the programme or with HKU in relation to this programme. There is no guarantee or expectation of any correctness etc. in its analysis.

    Any work done here is for the wider community and civil society. It has nothing to do with politics per se, and any comment related to that will be deleted. It is all about open data.

    For the copyright of the data, you have to refer to hkupop. I just heard today (9 July 2016, in a seminar) from Mr Robert Chung that they release the data for public use but do not mention the license. You can search that web site for that information.

    MIT License for the code and anything under my copyright

  • How to use the data

    • I will try R later, but as Mr Chung has mentioned PSPP (and the dataset format is SPSS .dat), I will try this first
  • PSPP installation

    • Refer to

      https://www.gnu.org/software/pspp/

    • Installation under Windows 7

      https://www.gnu.org/software/pspp/get.html

      Choose 32 bit if not sure; 64 bit might be faster. I use PSPP_0.10.1_2016-04-01_32bits (even though under Windows 7 32 bits)

    • Installation under macOS

      Under macOS, I am brave and use the latest version, under El Capitan

      sudo port install pspp-devel
      
      • This assumes you have already installed MacPorts (the port command); if not, try the following:

      • I do this as follows:

        sudo port selfupdate
        
      • You also need to install an X Window server and, in general, Xcode and its command-line tools ...

    • Installation under Ubuntu 14.04.4 LTS desktop x64 and also Ubuntu 16.04 desktop x64

      • Just do this; I do NOT recommend building from source (too many issues)

         sudo apt-get update
         sudo apt-get install pspp
        
      • You have to find the syntax editor and output windows; they are hidden even better than under the macOS X server!

  • PSPP running

    • As file conventions differ, you have to edit the file directory in the sample jobs

    • Also, when downloading on Windows, use the default GitHub location under your Documents folder; otherwise it may not work

    • Once again, file conventions differ between OSes, e.g. / vs \, and the location and even the home folder name are different

    • Running the graphical interface (you really have to find those other two windows, I am afraid)

       psppire
      

      (Good-old-days graphics; obviously not very friendly for sharing knowledge, but good for creating files! See the tutorial later.)

    • Running in command mode; this is how sharing works in the longer term:

      cd pspp-run
      pspp crosstab-sample-1.sps -o crosstab-sample-1.output.pdf
      pspp crosstab-sample-1.sps -o crosstab-sample-1.output.pdf -e crosstab-sample-1.issue.txt
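
The same command-mode pattern can be applied to every job in a directory. The sketch below is only an illustration: the function name run_all_jobs and the PSPP_BIN variable are my own inventions, and it simply reuses the -o and -e options and the .output.pdf / .issue.txt naming from the commands above.

```shell
#!/bin/sh
# Run every .sps job in a directory, writing a PDF report and an error log
# next to each job, following the naming used in the commands above.
# PSPP_BIN is a hypothetical override (e.g. PSPP_BIN=echo for a dry run).
PSPP_BIN="${PSPP_BIN:-pspp}"

run_all_jobs() {
    dir="$1"
    for job in "$dir"/*.sps; do
        [ -e "$job" ] || continue      # directory has no .sps files
        base="${job%.sps}"             # strip the .sps extension
        "$PSPP_BIN" "$job" -o "$base.output.pdf" -e "$base.issue.txt" \
            || echo "job failed: $job" >&2
    done
}

# Example call, assuming you are in the repository root:
# run_all_jobs pspp-run
```

Setting PSPP_BIN=echo before calling run_all_jobs prints the commands instead of running them, which is a cheap way to check the paths after editing them for your OS.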
      
      
