Add a data set that includes total population figures #7

Closed
cliftonmcintosh opened this issue Jun 25, 2018 · 16 comments

Comments

@cliftonmcintosh
Member

cliftonmcintosh commented Jun 25, 2018

The wazimap framework will only display data when there is a data set that includes total population figures. Find a data set that has full population figures for each of the bodies at each of the new levels for federal Nepal. Data at higher levels can be derived from data at the local level. Create the appropriate table for these data and add them to the profiles and to the view templates.

See https://github.com/Code4Nepal/nepalmap_app/blob/dev/wazimap_np/demographics.py#L45-L48 as an example of how total population is used.

The data at http://cbs.gov.np/sectoral_statistics/population/Population%20of%20753%20Local%20Units should work.

@cliftonmcintosh cliftonmcintosh changed the title aAdd a data set that includes total population figures Add a data set that includes total population figures Jun 25, 2018
@cliftonmcintosh
Member Author

@ravinepal

Do we have a data set that includes total population at the local level for the new local bodies?

@ravinepal
Member

ravinepal commented Jul 11, 2018

thanks, @cliftonmcintosh.

yes, here are CBS files:

  1. Province data
  2. Districts
  3. GauPalikas/Municipalities and Wards

Let me see if @Crackjack has suggestions or knows how to efficiently scrape data out of the website.

This website, created by the Ministry of Federal Affairs and General Administration, has population and area data for each admin level: http://103.69.124.141/
The data is in Nepali. I'll reach out to a contact to see if it is available in English.

Also, @pratimakandel, would you be interested in helping us scrape/translate data from http://103.69.124.141/ into a Google doc? I can share more details on how to do it if you are interested. Please let me know via email or by commenting here.

@openrijal
Collaborator

The links you pointed to are PDFs, and the Nepali website is built on Flash or something similar, so the links do not change from page to page.

If province-level data alone is enough, we can use the site; otherwise we need to figure out a way to extract the data from the PDFs.

I'm looking into it and will report back in a couple of hours.

@nikeshbalami
Collaborator

Scraping from the PDFs will be easy because they are already in English. I have already scraped the district and province data.

We still need to scrape the local units: http://cbs.gov.np/sectoral_statistics/population/Population%20of%20753%20Local%20Units

@cliftonmcintosh
Member Author

cliftonmcintosh commented Jul 12, 2018

If we want to display local level information, then we will be required to have local level population figures. A basic rule in the wazimap framework is that we must have at least one data set that includes total population for the lowest geographic/political unit that we wish to display.

Also, the figures for higher level can be derived from the lowest unit since we can just add together the values from the local level to get the district and higher levels, so we only really need the data from the local level.
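The rule above (every local unit we want to display needs a total population figure) could be checked before importing with something like the following. This is only a sketch: the unit names and numbers are made up, `units_missing_population` is a hypothetical helper, and treating a zero or negative value as missing is an assumption rather than a wazimap requirement.

```python
def units_missing_population(expected_units, population_by_unit):
    """Return the expected local units that have no usable total population."""
    missing = []
    for unit in expected_units:
        value = population_by_unit.get(unit)
        # Assumption: a missing, zero, or negative figure counts as "no data".
        if value is None or value <= 0:
            missing.append(unit)
    return missing


# Hypothetical sample data, not real census figures.
expected = ["Unit A", "Unit B", "Unit C"]
populations = {"Unit A": 1200, "Unit B": 0}

print(units_missing_population(expected, populations))
```

An empty result would mean every local unit has a figure and the higher levels can be derived from it.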

@ravinepal
Member

Thanks, @cliftonmcintosh. I'm scraping data. Here's the province-level population and household data. I submitted a PR to the data repo and requested your review:

CodeforNepal/data#13 (FYI @Crackjack)

@ravinepal
Member

@nikeshbalami - can you submit a pull request with the district-level data here: https://github.com/Code4Nepal/data/

I'll try to scrape local level data.

@cliftonmcintosh - then we will have data for all levels.

@cliftonmcintosh
Member Author

If we have the local level, we don't need data from any other level because we can just add up the local units to get the district level and add up that to get provinces, etc. That's probably preferable because then we know the data are consistent.

@nikeshbalami
Collaborator

I have started scraping the local levels; I need a couple of nights to complete the whole set.

@openrijal
Collaborator

what tool are you using? @nikeshbalami

would it be faster if all of us run the tool and use our computing time and process different levels?

@nikeshbalami
Collaborator

I use Tabula for the basic PDF scraping, @Crackjack.

Here I have submitted a pull request with 3 districts: CodeforNepal/data#14

I have created a separate folder structure, Federal Datasets > Population and HouseHold > District Name > Local Units. This pattern will make it easy to aggregate the data to the district and province levels.

@ravinepal I think you need to remove the commas from those totals. In a CSV file a comma separates columns, so a comma inside a number may cause errors when someone downloads the dataset and opens it in Excel or another tool.
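A rough sketch of that cleanup, using only the Python standard library. The column names (`name`, `population`, `households`) and the sample row are assumptions for illustration, not the actual layout of the files in the PR.

```python
import csv
import io


def parse_count(text):
    """Convert a count such as '12,345' to the integer 12345."""
    return int(text.replace(",", "").strip())


def clean_rows(rows, numeric_columns):
    """Return copies of the rows with thousands separators removed."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        for column in numeric_columns:
            new_row[column] = parse_count(row[column])
        cleaned.append(new_row)
    return cleaned


# Hypothetical row, not a real census figure.
raw = [{"name": "Unit A", "population": "12,345", "households": "2,100"}]
cleaned = clean_rows(raw, ["population", "households"])

# Write to an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "population", "households"])
writer.writeheader()
writer.writerows(cleaned)
print(buffer.getvalue())
```

Python's `csv` module would quote a field that contains a comma, so the file would still parse, but the value would then be read as text rather than as a number, which is why stripping the separators seems cleaner.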

@cliftonmcintosh
Member Author

cliftonmcintosh commented Jul 12, 2018

The data that @nikeshbalami is collecting at https://github.com/nikeshbalami/data/tree/f398ee21ec2bd03d9047f68bb519dd0faa45a4e1/Federal%20Data/753%20Local%20Unit%20Population%20and%20HouseHold are exactly the sort of thing we need. From that we can derive population for all levels above the local level. I would suggest that people concentrate on completing that data set. If team members produce files that are exactly the same shape, it will make them easier to convert to something that can be used for the federal NepalMap project.

Of course, even though the map project doesn't need it, there may be other reasons to scrape data for other levels.
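On the point about files having exactly the same shape, one way to check it might be to compare each file's header row against an agreed one. This is a sketch only: the header in `EXPECTED_HEADER`, the file names, and the helper functions are all hypothetical, and the real files may use different columns.

```python
import csv
import io

# Assumed header for illustration; the team would agree on the real one.
EXPECTED_HEADER = ["local_unit", "households", "population"]


def header_of(csv_text):
    """Return the first row of a CSV document, or an empty list if it has none."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def files_with_different_shape(files, expected_header):
    """Return the names of files whose header differs from the expected one."""
    return sorted(
        name for name, text in files.items()
        if header_of(text) != expected_header
    )


# Hypothetical file contents keyed by file name.
files = {
    "district_one.csv": "local_unit,households,population\nUnit A,10,50\n",
    "district_two.csv": "name,population\nUnit B,70\n",
}

print(files_with_different_shape(files, EXPECTED_HEADER))
```

Running a check like this before opening a pull request would flag files that need their columns renamed or reordered.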

@nikeshbalami
Collaborator

@openrijal
Collaborator

openrijal commented Jul 14, 2018

local levels are actually divided into 4 categories (Metropolitan, Sub-Metropolitan, Municipality, Rural Municipality)

To get the statistics for districts, these categories under them should be summed up
To get the statistics for provinces, the districts should be summed up
To get the statistics for country, the provinces should be summed up
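The roll-up described above could look roughly like this. It is a minimal sketch: the rows are invented, and the tuple layout `(province, district, local_unit, population)` and the `roll_up` helper are assumptions rather than the format of the scraped files.

```python
from collections import defaultdict

# Hypothetical rows: (province, district, local_unit, population).
ROWS = [
    ("Province X", "District 1", "Metropolitan A", 500),
    ("Province X", "District 1", "Rural Municipality B", 120),
    ("Province X", "District 2", "Municipality C", 300),
    ("Province Y", "District 3", "Sub-Metropolitan D", 250),
]


def roll_up(rows):
    """Sum local units into districts, districts into provinces, provinces into the country."""
    districts = defaultdict(int)
    for province, district, _local_unit, population in rows:
        # All four local-level categories are summed the same way.
        districts[(province, district)] += population

    provinces = defaultdict(int)
    for (province, _district), total in districts.items():
        provinces[province] += total

    country = sum(provinces.values())
    return dict(districts), dict(provinces), country


districts, provinces, country = roll_up(ROWS)
print(districts)
print(provinces)
print(country)
```

Because every higher level is computed from the level below it, the totals stay consistent with the local figures.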

@ravinepal
Member

I think this is done too, yeah?

@cliftonmcintosh
Member Author

cliftonmcintosh commented Sep 7, 2018

The data has not been imported into nepalmap_federal. It is now available in the data project here:
https://github.com/Code4Nepal/data/tree/master/Federal%20Data/753%20Local%20Unit%20Population%20and%20HouseHold.

It still needs to be transformed into a table in nepalmap_federal and added to a view. That can all be covered in #11.

cliftonmcintosh pushed a commit that referenced this issue Jan 19, 2020