This project contains Python scripts for studying Location Based Social Networks and disease simulation.

Description of files:

iofiles.tar : contains input files used by some scripts, and a few results
  * citydata/
      - census_raw.csv = original population data
      - census.out; census.p = dictionary city -> population
      - city_coor.csv = original coordinates data
      - city_coor.out; city_coordinates.p = dictionary city -> (latitude,longitude)
      - = parser for population data
      - = parser for coordinates data
      - states.p = dictionary state_name -> state_abbrv

  * google_data/
      - all.csv = google flu trends data 2004-2012(May)
      - match7-365d-norm-labeled.csv = May 10, 2009 to May 2, 2010, normalized
      - match7-norm = same as above, no date label

  * out/
      - city_list.p = list of cities
      - coordinates.p = dictionary city -> (latitude,longitude)
      - locations.p = dictionary city -> list of user_names
      - school_contacts.p = list of number of contacts per day (from school data)
      - contact_dist.out = list of number of contacts per day (from gowalla data)
      - trans_prob.csv = transition probability matrix

  * results/
      * gowalla/
          - avg-11.csv = 52-week results from gowalla network
          - avg-11-norm.csv = 52-week results, normalized
          - avg-11-norm-labeled.csv = 52-week results with date and states labels
          - matrix-*.out = 365-day simulation results
          - scores11 = sorted distances for every 52-week subset against google flu
          - statescores-11 = state-by-state distance against google flu on
            selected week (this is for starting week 281)
              <state_index> <state_abbrv>:<distance_score>
      * perm/
          - (same naming convention as above, but results for permuted transition
            probabilities)
      * rand/
          - (similar files but results for randomized transition probabilities)
          - matrix0-1-randomT.csv = randomized transition probability matrix

 : executes faster_sim on grid using runCmd
    - outputs the resulting matrices in a results/ directory

 : modified disease simulation using vectors for disease compartments
    - outputs [matrix$] with the incidences for each state at each time step
      (365 timesteps x #states)
#python [n] [prob] out/school_contacts.p citydata/states.p citydata/census.p out/trans_prob.csv citydata/city_list.p [google_data/all.csv] [matrix$]

 : contains functions used in simulation

data_analysis/ : computes the 52-week tally for each state
    - outputs 52-week tallies and normalized values
#python matrix$count n avg-matrix-$count
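
The tally step above can be sketched as follows. This is a minimal illustration, not the script itself: the function names weekly_tally and normalize_columns are hypothetical, the 365th day is assumed to be dropped so that 52 full weeks remain, and normalization is assumed to mean scaling each state's column by its peak week, which the description does not confirm.

```python
def weekly_tally(daily, weeks=52, days_per_week=7):
    """Sum daily incidence rows (one column per state) into weekly rows.

    Assumption: days beyond weeks * days_per_week (day 365) are ignored.
    """
    n_states = len(daily[0])
    tallies = []
    for w in range(weeks):
        chunk = daily[w * days_per_week:(w + 1) * days_per_week]
        tallies.append([sum(row[s] for row in chunk) for s in range(n_states)])
    return tallies


def normalize_columns(tallies):
    """Scale each state's column by its peak value (assumed normalization)."""
    n_states = len(tallies[0])
    peaks = [max(row[s] for row in tallies) for s in range(n_states)]
    return [
        [row[s] / peaks[s] if peaks[s] else 0.0 for s in range(n_states)]
        for row in tallies
    ]


# Toy input: 365 days x 2 states with constant daily incidence.
daily = [[1, 2] for _ in range(365)]
weekly = weekly_tally(daily)
normalized = normalize_columns(weekly)
```

With real data, daily would be read from one of the matrix-*.out files and the two results written out as the tally and its normalized counterpart.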

data_analysis/ : computes the Euclidean distance between the
resulting normalized matrix and every 52-week subset of the google flu data
    - outputs list of distances; each index corresponds to the starting week
      number in the google data
#python [n] [prob] out/school_contacts.p citydata/states.p citydata/census.p out/trans_prob.csv citydata/city_list.p [google_data/all.csv] [matrix$]
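
The distance computation above can be sketched as a sliding-window comparison. This is an illustration under assumptions: the function name window_distances is hypothetical, both inputs are taken to be lists of weekly rows with one column per state in the same state order, and the distance is assumed to be taken over all entries of the window at once.

```python
import math


def window_distances(sim, google):
    """Return one Euclidean distance per possible starting week in google.

    sim    : list of weekly rows (e.g. 52 x #states), normalized
    google : list of weekly rows covering a longer period, same columns
    """
    span = len(sim)
    distances = []
    for start in range(len(google) - span + 1):
        window = google[start:start + span]
        squared = sum(
            (a - b) ** 2
            for sim_row, g_row in zip(sim, window)
            for a, b in zip(sim_row, g_row)
        )
        distances.append(math.sqrt(squared))
    return distances


# Toy data: a 2-week simulation compared against 4 weeks of reference data.
sim = [[1.0, 0.0], [0.0, 1.0]]
google = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
distances = window_distances(sim, google)
best_start = distances.index(min(distances))
```

Sorting the (distance, start week) pairs gives a ranking of the kind stored in the scores files under results/.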

This section needs to be updated:

 : parses raw gowalla data to extract desired fields
#python [gowalla_raw] [output]

 : fixes location info from citydata (census/coord)
    - outputs not_found.out for unfound cities
#python location_data.raw census.p city_coordinates.p

 : fixes timestamps to standard Eastern timezone

citydata/ : directory contains raw data for census, city coordinates, parser
for raw files and pickled data
    **TODO: need to update raw census data, check mismatch with city_coord

 : extracts info about checkins
    - time_diff, distance, year09, year10, freq
#python data_info.out
    - extracts info from data_info.out
    - time_diff, distance, total_checkins, total_time, speed

chartmaker/ : directory contains scripts for producing charts

 : remove freq/problem users
#python [location_data] freq.out removed.out

 : extracts city based info
    - locations.p: maps city to list of users that visited the city
    - coordinates.p : maps city to avg (lat, long) coordinates from all locations
      in that city
    - location_stats.out : city/state user statistics
#python location_stats.out

 : extracts user checkin history
    - user_checkins.out : maps user to list of (city, date) checkins sorted
    - finds and separates Austin users
#python user_checkins.out austin.out

 : constructs location-based network from user checkin history
    - uses LocationGraph from
    - parses user_checkins.out to rebuild user checkin history
    - for each user, traverse history, add edge from one city to the next
    - for every city node created, set coordinates from coordinates.p
    - for each Austin walker, set edge weights as .5
    - add epsilon (5) weight between all pairs of city nodes
    - save network as gowalla_net
    - outputs network info
#python user_checkins.out austin.out coordinates.p gowalla_net

 : LocationGraph data structure uses networkx, directed graph

 : executes disease simulation
#python [time_steps] [init_n] [n] [prob] locations.p census.p gowalla_net sim.out
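
The network-construction steps listed earlier for gowalla_net (traverse each user's checkin history, add an edge from one city to the next, weight Austin walkers at .5, add epsilon between all pairs) can be sketched roughly as below. Several things here are assumptions, not the project's code: the function name build_network is hypothetical, a plain dictionary stands in for the networkx directed graph that LocationGraph is described as using, each non-Austin transition is assumed to add a weight of 1, each Austin transition is assumed to add 0.5, and epsilon is read as the value 5 added to every ordered pair of distinct cities.

```python
from itertools import permutations


def build_network(histories, austin_users, coordinates, epsilon=5.0):
    """Build directed, weighted city-to-city edges from checkin histories.

    histories    : user -> list of (city, date) checkins, already sorted
    austin_users : set of users treated as Austin walkers
    coordinates  : city -> (latitude, longitude)
    """
    weights = {}  # (from_city, to_city) -> accumulated weight
    for user, checkins in histories.items():
        # Assumed weights: 0.5 per step for Austin walkers, 1.0 otherwise.
        step = 0.5 if user in austin_users else 1.0
        for (src, _), (dst, _) in zip(checkins, checkins[1:]):
            weights[(src, dst)] = weights.get((src, dst), 0.0) + step

    cities = sorted({city for checkins in histories.values() for city, _ in checkins})

    # Add epsilon to every ordered pair of distinct cities.
    for src, dst in permutations(cities, 2):
        weights[(src, dst)] = weights.get((src, dst), 0.0) + epsilon

    # Attach coordinates to each city node (None if the city is unknown).
    nodes = {city: coordinates.get(city) for city in cities}
    return nodes, weights


# Toy example with two hypothetical users.
histories = {
    "u1": [("Austin, TX", "2010-01-01"), ("Dallas, TX", "2010-01-02")],
    "u2": [("Dallas, TX", "2010-01-03"), ("Austin, TX", "2010-01-04")],
}
coordinates = {"Austin, TX": (30.27, -97.74), "Dallas, TX": (32.78, -96.80)}
nodes, weights = build_network(histories, {"u2"}, coordinates)
```

Adding epsilon to every pair keeps the graph fully connected, so a transition between any two cities stays possible even when no user made that trip.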