Countries and Their Acronym Aliases Showing Up As Separate Nodes on Graph #148

Open
ikartik90 opened this issue Mar 28, 2020 · 1 comment

Comments

@ikartik90

Describe the bug
Countries that are also referred to by acronyms, such as UK for the United Kingdom, are showing up as two different international destination nodes on the travel cluster graph. In the screenshot below, "UK" and "United Kingdom" appear as two separate nodes. Similarly, "US" and "USA" appear as two separate international destination nodes. This skews and misrepresents the information.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'www.covid19india.org'
  2. Click on the 'Clusters' tab
  3. Select the 'Travel' cluster filter
  4. Notice that "UK" and "United Kingdom", and "USA" and "US", appear as separate nodes on the graph

Expected behavior
Countries and their aliases should be merged and represented as a single node.

Screenshots
Screenshot 2020-03-28 at 3 12 22 PM

Additional context
The problem probably arises because the CSV values in Patient Notes are crawled and mapped onto nodes without first being checked for alias values.
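
One possible way to handle this, as a rough sketch only: normalize each destination name against an alias table before creating nodes. The table `COUNTRY_ALIASES` and the functions `canonical_country` and `merge_destination_nodes` are hypothetical names used for illustration; they are not taken from this repo or from the NLP API it uses, and a real table would need many more entries.

```python
# Hypothetical alias table (an assumption for illustration, not project data).
# Keys are lowercase names with periods removed; values are the canonical name.
COUNTRY_ALIASES = {
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "us": "United States",
    "usa": "United States",
    "united states": "United States",
}


def canonical_country(name: str) -> str:
    """Return the canonical country name, or the trimmed input if no alias is known."""
    # Drop periods ("U.K." -> "UK"), collapse whitespace, and lowercase for lookup.
    key = " ".join(name.replace(".", "").split()).lower()
    return COUNTRY_ALIASES.get(key, name.strip())


def merge_destination_nodes(names):
    """Count destinations per canonical country so each country maps to one node."""
    counts = {}
    for name in names:
        canon = canonical_country(name)
        counts[canon] = counts.get(canon, 0) + 1
    return counts


if __name__ == "__main__":
    raw = ["UK", "United Kingdom", "USA", "US", "Italy"]
    print(merge_destination_nodes(raw))
```

With this approach, "UK" and "United Kingdom" collapse into one node, as do "US" and "USA", while names without a known alias pass through unchanged.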

Related Issue
#147

@sibeshkar
Collaborator

This repo uses the NLP API from https://github.com/NirantK/coronaIndia to turn unstructured notes into structured travel data. Would it be possible to report this on that repo?
