Present total population and population by gender #11

Closed
cliftonmcintosh opened this issue Jul 18, 2018 · 21 comments

@cliftonmcintosh
Member

cliftonmcintosh commented Jul 18, 2018

Once the population data has been integrated as part of #7, present the total population with a breakdown by gender.

This includes:

Data sets for this are here: https://github.com/Code4Nepal/data/tree/master/Federal%20Data/753%20Local%20Unit%20Population%20and%20HouseHold
This only has provinces 1 and 2 right now, though.

@ravinepal
Member

@Beegrekokto, @Soneeka, @sunitagajurel, @theonlyNischal - pls let us know if one of you is interested in leading on this issue. thank you.

@ravinepal
Member

dear @Beegrekokto, @Soneeka, @sunitagajurel, @theonlyNischal. i hope all is well. i know some of you had exams. i wanted to follow up to see if one of you is interested in leading on this issue.

@cliftonmcintosh
Member Author

#16 needs to be addressed before this issue can be worked on.

@amitness amitness self-assigned this Sep 29, 2018
@amitness
Member

amitness commented Oct 7, 2018

@Crackjack @cliftonmcintosh

The instructions to load data into the Postgres database seem to be out of date. There is no simpletables folder in the sql directory. Can someone guide me with instructions to load the database dump?

image

@amitness
Member

amitness commented Oct 7, 2018

Also, I'm getting an error regarding WhiteNoise when running the app with python manage.py runserver. @Crackjack Did you get any such error when setting up?

image

Also, are we using Python 2 or 3 for this app?

@cliftonmcintosh
Member Author

@amitness

There are no simpletables yet. You should still be able to load the database tables. You can either run that command or skip it. Either way you should be able to proceed to the next steps successfully.

We could remove that from the README, but we would need to remember to add it back in as soon as we included any simpletables.

@cliftonmcintosh
Member Author

cliftonmcintosh commented Oct 7, 2018

@amitness

Wazimap is not compatible with python 3. I do not get that exception for WhiteNoise with python manage.py runserver, and the homepage loads for me. Of course, since we have no more data, that is as far as it goes!

nepalmap_federal

@amitness
Member

amitness commented Oct 7, 2018

@cliftonmcintosh Thanks. The problem was due to wazimap's dependency on a library called whitenoise. Installing from our requirements.txt pulled in the latest version of whitenoise, which is incompatible with how wazimap uses that library. I was able to fix it by uninstalling it with pip uninstall whitenoise and then installing an older version with pip install whitenoise==3.3.1.

@amitness
Member

@cliftonmcintosh @ravinepal I converted the data from the local structures to the format we need for wazimap. While mapping the geography.sql codes to it, I found inconsistencies for some VDCs.

For example, for district 'Nawalpur', we have mappings to only 7 local levels in geography.sql

image

But, in the csv data repository, we have 10 VDCs for 'Nawalpur'.

image

Can I know the source from which the geography.sql mapping for province -> district -> local level was created? I need to cross-verify the data to proceed. The data shows that we have missing mappings.

I was thinking of recreating the geography.sql file from the data that was scraped but there is some interesting data such as:
Province 5:
Kapilbastu: Mayadevi Gaunpalika.csv
Nawalpur: Mayadevi Gaunpalika.csv
Rupandehi: Mayadevi Gaunpalika.csv

All 3 districts have a VDC with the same name but different population readings. Can I trust the scraped data as the ground truth and recreate a mapping from it for geography.sql?
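A minimal sketch of one way to handle the repeated name, assuming the scraped files can be listed as (district, file name) pairs: group by local-level name to find names that are not unique, and use a (district, name) composite key when building the mapping. The helper `local_level_name` and the list layout are hypothetical; only the three file names come from the comment above.

```python
from collections import defaultdict

# (district, CSV file name) pairs, as listed in the comment above.
scraped_files = [
    ("Kapilbastu", "Mayadevi Gaunpalika.csv"),
    ("Nawalpur", "Mayadevi Gaunpalika.csv"),
    ("Rupandehi", "Mayadevi Gaunpalika.csv"),
]


def local_level_name(file_name):
    """Strip the .csv suffix to get the local level's name (hypothetical helper)."""
    suffix = ".csv"
    return file_name[:-len(suffix)] if file_name.endswith(suffix) else file_name


# Group districts by local-level name to find names shared across districts.
districts_by_name = defaultdict(list)
for district, file_name in scraped_files:
    districts_by_name[local_level_name(file_name)].append(district)

ambiguous = {name: ds for name, ds in districts_by_name.items() if len(ds) > 1}

# A (district, name) key stays unique even when the name alone does not.
composite_keys = {(district, local_level_name(f)) for district, f in scraped_files}

print(ambiguous)
print(len(composite_keys))
```

Keying on the pair means the three Mayadevi entries never overwrite one another, whichever source ends up being treated as the ground truth.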

@cliftonmcintosh
Member Author

cliftonmcintosh commented Oct 28, 2018 via email

@openrijal
Collaborator

@amitness the geography.sql was created out of geojson and topojson files from different sources, plus a few other sources.

Nawalpur is one of the districts which was renamed from "nawalparasi". There might be discrepancies. If you find a single source of truth, we can use that.

@amitness
Member

@cliftonmcintosh It's working now. I've skipped the local levels with discrepancies for now. We can discuss and solve it in #19. Please review the PR.

image

I also found a new problem. The URL pattern used by Wazimap to detect the geocode from the URL was like this:

https://nepalmap.org/profiles/vdc-1425-bhadgau-sinawari/

Here vdc-1425 was split into 'vdc' and '1425' as geo_level and geo_code respectively.

But, for federal data, we have geo_code in format such as 'pro-06'.

http://127.0.0.1:8000/profiles/province-pro-6-province-no-6/

So, wazimap was treating 'province' as the geo_level and 'pro' as the geo_code instead of 'pro-6', causing errors. This was hardcoded in the wazimap app itself, so the URLs for province, district and local levels were not working.

To fix it, I created a new urls.py, imported all URLs from wazimap, replaced this specific pattern with our new pattern and set our urls module as the root URLconf in settings.
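The mismatch described above can be reproduced with Python's `re` module. This is a rough sketch, not Wazimap's code: the pattern is the original one quoted later in this thread, and the helper `split_geo_id`, which splits the matched id on its first dash, is an assumption based on the behaviour described in this comment. Paths are written without a leading slash, as Django URL patterns expect.

```python
import re

# Original Wazimap pattern, as quoted later in this thread.
ORIGINAL = re.compile(r'^profiles/(?P<geography_id>\w+-\w+)(-(?P<slug>[\w-]+))?/$')


def split_geo_id(path):
    """Return (geo_level, geo_code) for a URL path, or None if it does not match.

    Splitting on the first dash is an assumption drawn from the description
    above, not a copy of Wazimap's implementation.
    """
    match = ORIGINAL.match(path)
    if match is None:
        return None
    geo_level, geo_code = match.group('geography_id').split('-', 1)
    return geo_level, geo_code


old_style = split_geo_id('profiles/vdc-1425-bhadgau-sinawari/')
federal = split_geo_id('profiles/province-pro-6-province-no-6/')

print(old_style)  # ('vdc', '1425')
print(federal)    # ('province', 'pro')
```

Because `\w` does not match a dash, the two-part pattern stops at 'province-pro' and the rest of the geocode ends up in the slug.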

@amitness
Member

@Crackjack Can you help me out with this?

urlpatterns.append(
    url(
        regex   = r'^profiles/(?P<geography_id>\w+-\w+-\w+)(-(?P<slug>[\w-]+))?/$',
        view    = cache_page(STANDARD_CACHE_TIME)(GeographyDetailView.as_view()),
        kwargs  = {},
        name    = 'geography_detail_country',
    )
)

This is the regex I replaced the existing Wazimap URL with. With this regex, it matches province, district and local levels.

http://127.0.0.1:8000/profiles/local-loc-4010-jhapa/
http://127.0.0.1:8000/profiles/district-dis-28-jhapa/

But, the below URL is not matched, since geo_level is 'country' and geo_code turns out to be 'NP-nepal' instead of 'NP' that we need.

http://127.0.0.1:8000/profiles/country-NP-nepal/

This is the original URL pattern that was present in Wazimap and it's hardcoded.

    url(
        regex   = '^profiles/(?P<geography_id>\w+-\w+)(-(?P<slug>[\w-]+))?/$',
        view    = cache_page(STANDARD_CACHE_TIME)(GeographyDetailView.as_view()),
        kwargs  = {},
        name    = 'geography_detail',
    ),

This matches the country level but doesn't work with the province, local and district level geocodes that we are using.

Is there any way to modify the regex to match both of them?
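One possible answer, offered only as a sketch: use an alternation so that ids starting with 'country' keep two parts while every other level keeps three. The pattern `COMBINED` and the helper `geography_id` are hypothetical and have not been tested against Wazimap itself. The thread goes on to resolve the problem differently, by changing the geocodes (see #21).

```python
import re

# Hypothetical combined pattern: 'country-XX' ids have two parts,
# all other levels have three (for example 'district-dis-28').
COMBINED = re.compile(
    r'^profiles/(?P<geography_id>country-\w+|\w+-\w+-\w+)(-(?P<slug>[\w-]+))?/$'
)


def geography_id(path):
    """Return the geography_id captured from a URL path, or None if no match."""
    match = COMBINED.match(path)
    return match.group('geography_id') if match else None


paths = [
    'profiles/country-NP-nepal/',
    'profiles/province-pro-6-province-no-6/',
    'profiles/district-dis-28-jhapa/',
    'profiles/local-loc-4010-jhapa/',
]
captured = [geography_id(p) for p in paths]
print(captured)
```

The order of the alternatives matters: 'country-\w+' is tried first, so 'country-NP-nepal' is captured as 'country-NP' with 'nepal' left as the slug.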

@cliftonmcintosh
Member Author

cliftonmcintosh commented Oct 30, 2018 via email

@amitness
Member

@cliftonmcintosh Previously, I tried removing the dash in the geocodes. The URLs worked but the map for that region failed to load, so I had to revert that. I think it's because the geocodes in the geojson were still the old ones. I'll try renaming both and see if it fixes the problem.

@cliftonmcintosh
Member Author

cliftonmcintosh commented Oct 30, 2018

Yes, they have to match the codes in the geo files, so they would need to be modified in those as well
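A quick consistency check along these lines might look like the sketch below. Everything in it is illustrative: the property name "code", the function `codes_missing_from_geo_file` and the sample geocodes are assumptions, since the actual layout of the geo files is not shown in this thread.

```python
def codes_missing_from_geo_file(db_codes, geojson, code_property="code"):
    """Return geocodes present in the database but absent from the GeoJSON features.

    `code_property` is a hypothetical property name; adjust it to whatever
    the real geo files use.
    """
    file_codes = {
        feature["properties"].get(code_property)
        for feature in geojson.get("features", [])
    }
    return sorted(set(db_codes) - file_codes)


# Illustrative data only: the database codes were renamed without dashes,
# but one feature in the geo file still carries the old dashed code.
db_codes = ["pro6", "dis28"]
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"code": "pro-6"}, "geometry": None},
        {"type": "Feature", "properties": {"code": "dis28"}, "geometry": None},
    ],
}

missing = codes_missing_from_geo_file(db_codes, geojson)
print(missing)
```

An empty result would suggest every renamed code has a matching feature, which is the condition the map needs in order to load.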

@ravinepal
Member

hi @amitness - just checking in to see how we can help here. we have received interest from new volunteers and i can request them to help.

@cliftonmcintosh
Member Author

@ravinepal

The dashes in the geocodes have been corrected by #21.

From what I can tell, the most pressing issue is the data discrepancies described in #19. Those must be resolved in order for us to be able to present the basic population data.

@amitness
Member

@ravinepal It would be great to get them involved. The major roadblock currently is #19.

@cliftonmcintosh
Member Author

With @nikeshbalami's data updates and his guidance provided in a comment on #19, we should now have the ability to account for the missing data. @amitness has some tooling for processing data in a Jupyter notebook. See this comment: #19 (comment).

@ravinepal if one of the volunteers wants to try to process the missing data, they are welcome to do so. This would include:

  • including data for the missing localities that have now all been identified in the data set.
  • including zero values for all national parks, wildlife reserves and development districts.
  • updating the aggregated totals for all levels above the local level so that they include the new figures. For example, if a missing locality is added to Banke district, then the totals for that district need to be updated to include the locality's figures. The totals for Province 5 and the totals for Nepal would likewise need to be updated to include the locality's figures.
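The roll-up described in the last bullet could be sketched as follows. This assumes each local-level record is a (province, district, local level, population) tuple; the function `roll_up`, the locality names and all population figures are placeholders for illustration, not real data and not the project's actual tooling.

```python
from collections import defaultdict


def roll_up(local_rows):
    """Sum local-level populations into district, province and country totals.

    Each row is (province, district, local_level, population).
    """
    district_totals = defaultdict(int)
    province_totals = defaultdict(int)
    country_total = 0
    for province, district, _local_level, population in local_rows:
        district_totals[(province, district)] += population
        province_totals[province] += population
        country_total += population
    return dict(district_totals), dict(province_totals), country_total


# Placeholder rows; the zero-valued row stands in for a national park.
rows = [
    ("Province 5", "Banke", "Locality A", 100),
    ("Province 5", "Banke", "Locality B", 50),
    ("Province 5", "Banke", "National park (placeholder)", 0),
]

before = roll_up(rows)
after = roll_up(rows + [("Province 5", "Banke", "Newly added locality", 25)])

print(before)
print(after)
```

Recomputing every level from the local rows, rather than patching each total by hand, keeps the district, province and country figures consistent whenever a locality is added.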

@ravinepal
Member

This has been resolved, I believe. @cliftonmcintosh - fine to close this?
