adnanwarsi/north-america-mosques

north-america-mosques

This is a set of Python scripts that extract mosque data from the Google Places API, run over the North American region.

Results of the scripts run in March 2017:

https://public.tableau.com/profile/adnan.warsi#!/vizhome/UnitedStatesMosques-asofJan2020/Sheet1

Run the following sequence of scripts with these packages installed: geopy, googlemaps, io, json, math, openpyxl, os, re, requests, sys, time, xlwt (io, json, math, os, re, sys and time ship with Python's standard library).

To set up the package dependencies, use:

$ pip install -r requirements.txt

01_generate_search_coords.py

Generates the radial search coordinates. The coordinates are laid out so that the radial search circles form a honeycomb lattice. Adjacent coordinates are at most sqrt(3)*R apart, where R is the search radius, so that the circles overlap enough to cover the whole area. The script creates a search_coords.xls file with the coordinates.
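A minimal sketch of the lattice idea, not the script's actual code. It assumes a rectangular bounding box, a spherical Earth, and a flat-map approximation within each row; the function name honeycomb_centers and its parameters are hypothetical. Neighbours within a row sit sqrt(3)*R apart, rows sit 1.5*R apart, and every other row is shifted by half a step.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def honeycomb_centers(lat_min, lat_max, lon_min, lon_max, radius_m):
    """Return (lat, lon) circle centers on a triangular lattice over the box."""
    spacing_m = math.sqrt(3) * radius_m   # distance between neighbours in a row
    row_step_m = 1.5 * radius_m           # distance between consecutive rows
    lat_step = math.degrees(row_step_m / EARTH_RADIUS_M)
    centers = []
    row = 0
    lat = lat_min
    # Run one step past each edge so the border of the box is covered too.
    while lat <= lat_max + lat_step:
        # A degree of longitude gets shorter with cos(latitude).
        lon_step = math.degrees(
            spacing_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
        # Shift every other row by half a step to get the honeycomb pattern.
        lon = lon_min + (lon_step / 2 if row % 2 else 0.0)
        while lon <= lon_max + lon_step:
            centers.append((round(lat, 6), round(lon, 6)))
            lon += lon_step
        lat += lat_step
        row += 1
    return centers
```

For example, honeycomb_centers(40.0, 40.5, -75.0, -74.5, 10000) lists centers for 10 km search circles over a half-degree box. Because the longitude step is recomputed per row, the pattern is only approximately regular across large latitude ranges.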

02_qualify_search_coords_tobe_incountry.py

This script runs over all coordinates in the search_coords.xls file generated by the previous script and checks whether each coordinate is in the United States. It adds a column that holds the result of this check.
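The qualification step could look roughly like the sketch below. The README does not say how the country check is done, so reverse geocoding through geopy's Nominatim geocoder is an assumption here, and the names qualify_coords and make_nominatim_lookup are hypothetical. The lookup is passed in as a function so the tagging logic can be tried without network access.

```python
def qualify_coords(coords, lookup_country_code, country_code="us"):
    """Tag each (lat, lon) with True/False for 'inside country_code'.

    lookup_country_code is any callable (lat, lon) -> ISO country code or None.
    """
    rows = []
    for lat, lon in coords:
        code = lookup_country_code(lat, lon)
        in_country = code is not None and code.lower() == country_code
        rows.append((lat, lon, in_country))
    return rows


def make_nominatim_lookup(user_agent="mosque-coords-example"):
    """Build a lookup backed by geopy's Nominatim (assumed; needs network)."""
    from geopy.geocoders import Nominatim  # third-party, imported lazily

    geocoder = Nominatim(user_agent=user_agent)

    def lookup(lat, lon):
        location = geocoder.reverse((lat, lon), language="en")
        if location is None:  # e.g. a point in the open ocean
            return None
        return location.raw.get("address", {}).get("country_code")

    return lookup
```

A call such as qualify_coords(coords, make_nominatim_lookup()) would then yield one (lat, lon, flag) row per coordinate, ready to be written back as the extra column. The public Nominatim service is rate limited, so a real run over many coordinates would need throttling.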

Manual edit of file

Filter the file down to the coordinates within the US and save it as only_us_coords.xlsx.

03_search_radial_all_coords_with_keyword.py

Run this script on the only_us_coords.xlsx coordinates file three separate times, once for each keyword search argument: Mosque, Masjid and Muslim. Each run generates a corresponding set of files in the places_search_data directory.
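One keyword run might be sketched as follows. This is an illustration, not the script itself: the function name search_keyword, the output file naming and the page delay are hypothetical, and the client is assumed to be a googlemaps.Client, whose places_nearby method wraps the Places Nearby Search endpoint.

```python
import json
import os
import time


def search_keyword(client, coords, keyword, radius_m,
                   out_dir="places_search_data", page_delay_s=2.0):
    """Run one keyword search per coordinate and save each result list as JSON."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, (lat, lon) in enumerate(coords):
        response = client.places_nearby(
            location=(lat, lon), radius=radius_m, keyword=keyword)
        results = list(response.get("results", []))
        # Results come in pages; follow next_page_token until none is returned.
        while response.get("next_page_token"):
            time.sleep(page_delay_s)  # a token needs a short delay before it is valid
            response = client.places_nearby(
                location=(lat, lon), page_token=response["next_page_token"])
            results.extend(response.get("results", []))
        # Hypothetical naming scheme: <keyword>_<coordinate index>.json
        path = os.path.join(out_dir, f"{keyword.lower()}_{index:05d}.json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(results, handle)
        written.append(path)
    return written
```

With client = googlemaps.Client(key="YOUR_API_KEY"), calling search_keyword once each for "Mosque", "Masjid" and "Muslim" would mirror the three runs described above.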

04_combine_data_files_info.py

Run this script with the folder (places_search_data) generated by the previous script runs. It creates a JSON file, composite_data_file.json, which contains all place nodes and their data.
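Since the three keyword runs search the same overlapping circles, the same place is likely to appear many times. A minimal sketch of the merge follows, assuming each file in the folder holds a JSON list of Places results and that duplicates are identified by their place_id field. The function name combine_place_files and the keep-first policy are hypothetical.

```python
import glob
import json
import os


def combine_place_files(folder, out_path="composite_data_file.json"):
    """Merge every *.json result file in folder into one dict keyed by place_id."""
    places = {}
    for path in sorted(glob.glob(os.path.join(folder, "*.json"))):
        with open(path, encoding="utf-8") as handle:
            for place in json.load(handle):
                place_id = place.get("place_id")
                if place_id:
                    # Keep the first copy seen; later duplicates are dropped.
                    places.setdefault(place_id, place)
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(places, handle, indent=2)
    return places
```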

05_create_xls_from_big_data_file.py

This script takes composite_data_file.json and generates an Excel file with all the places and their fields.
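The conversion can be sketched in two steps: flatten each place into a row, then write the rows to a workbook. The column choice below is an assumption based on fields that Places Nearby Search results normally carry (place_id, name, vicinity, geometry.location); the real script may export more. The names places_to_rows and write_xlsx are hypothetical, and openpyxl is used for writing.

```python
COLUMNS = ("place_id", "name", "vicinity", "lat", "lng")


def places_to_rows(places):
    """Flatten an iterable of place dicts into a header row plus data rows."""
    rows = [list(COLUMNS)]
    for place in places:
        location = place.get("geometry", {}).get("location", {})
        rows.append([
            place.get("place_id"),
            place.get("name"),
            place.get("vicinity"),
            location.get("lat"),
            location.get("lng"),
        ])
    return rows


def write_xlsx(rows, path):
    """Write the rows to a single-sheet .xlsx file using openpyxl."""
    from openpyxl import Workbook  # third-party, imported lazily

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)
    workbook.save(path)
```

If the composite file is a JSON object keyed by place_id, passing its values, as in write_xlsx(places_to_rows(data.values()), "places.xlsx"), would produce one spreadsheet row per place. Missing fields end up as empty cells.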

06_scan_website_for_hints.py

Run this script with the .xlsx file from the previous script. It scans each record's website to determine whether any of the Sunni mosques actually belong to another sect.
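The README does not list which hints the script looks for, so the sketch below only shows the general shape of such a scan: counting whole-word occurrences of caller-supplied terms in a page's text. The function name count_hints and the label-to-terms mapping are hypothetical.

```python
import re


def count_hints(page_text, hint_terms):
    """Count whole-word, case-insensitive hits per label.

    hint_terms maps a label to a list of terms, all supplied by the caller.
    """
    text = page_text.lower()
    counts = {}
    for label, terms in hint_terms.items():
        counts[label] = sum(
            len(re.findall(r"\b" + re.escape(term.lower()) + r"\b", text))
            for term in terms
        )
    return counts
```

A record would then be flagged for manual review when any label's count is above zero. The page text itself could be fetched with requests.get(url, timeout=10).text.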

07_FourSquare_Stats_for_xls_data.py

Run this script with the .xlsx file as its argument. It queries the FourSquare API centered at the Lat,Lon of each record with a 100m radius. If a result is found, it extracts the 'userscount' as a proxy for foot traffic at the location.

08_scan_websites_for_email_contacts.py

This script scans the website of each record, if one exists, for an email address and records it if found.
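A minimal sketch of the email scan, assuming a simple regular expression over the raw page HTML. The pattern is a common approximation and does not cover every valid address form. The names EMAIL_RE, extract_emails and fetch_html are hypothetical, and the actual script may work differently.

```python
import re

# Approximate pattern: local part, "@", domain, and a top-level domain of 2+ letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html):
    """Return the unique email addresses found in html, lowercased, in first-seen order."""
    found = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found


def fetch_html(url, timeout_s=10):
    """Download a page with requests; returns an empty string on any request error."""
    import requests  # third-party, imported lazily

    try:
        response = requests.get(url, timeout=timeout_s)
        response.raise_for_status()
    except requests.RequestException:
        return ""
    return response.text
```

For a record with a website, extract_emails(fetch_html(url)) gives the candidate addresses. Records without a website are simply skipped.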