Issue in convert csv to dwca #690

Closed
patkyn opened this issue Jul 2, 2021 · 3 comments

@patkyn
Contributor

patkyn commented Jul 2, 2021

Currently, new or updated records show individualCount and coordinatePrecision as floats. For example: https://biocache.ala.org.au/occurrences/0e76fc74-269c-4681-9c93-20ba6859849b

Existing records that have not been updated remain unchanged. For example: https://biocache.ala.org.au/occurrences/f45c2bbb-9bda-400b-923b-6dd59ac326f2

This happens because the CSV to DwCA converter converts values to floats, when it should read them in as strings. For example, the Bionet job extracts the sightings as strings, but when they are read in dwca-dr they are converted to numbers in the generated DwCA.

image
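
To illustrate the difference, here is a minimal sketch in Python using only the standard library. It is not the converter's actual code: the sample rows, the occurrenceID values, and both helper functions are hypothetical, and only the column names individualCount and coordinatePrecision come from this issue.

```python
import csv
import io

# Hypothetical extract; only the column names are taken from the issue.
raw = "occurrenceID,individualCount,coordinatePrecision\nabc-1,5,0.0001\nabc-2,,\n"


def read_as_strings(text):
    """Desired behaviour: every value stays exactly as written in the CSV."""
    return list(csv.DictReader(io.StringIO(text)))


def read_with_float_coercion(text):
    """Unwanted behaviour: numeric-looking values are pushed through float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        coerced = {}
        for key, value in row.items():
            try:
                coerced[key] = str(float(value))
            except ValueError:
                # Non-numeric and empty values are left untouched.
                coerced[key] = value
        rows.append(coerced)
    return rows


string_rows = read_as_strings(raw)
float_rows = read_with_float_coercion(raw)

print(string_rows[0]["individualCount"])  # value kept as a string
print(float_rows[0]["individualCount"])   # value after float coercion
```

Keeping the values as strings means an integer count such as 5 is written to the archive unchanged instead of picking up a decimal part.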

Another issue with the CSV to DwCA converter occurs when a field contains a multiline string. This happens for dr344: a record has been split into two records because of a line break inside a field.

image
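
The split can be reproduced with a small Python sketch, assuming the multiline value is quoted in the CSV. This is only an illustration, not the converter's code: the catalogNumber column and its values are hypothetical, while the preparations value mirrors the "Pin\nSlide" example discussed in this thread.

```python
import csv
import io

# Hypothetical two-record extract; the first record has a line break
# inside its quoted preparations field.
raw = 'catalogNumber,preparations\nT1,"Pin\nSlide"\nT2,Pin\n'

# Splitting on line breaks treats the embedded newline as a record boundary.
naive_records = raw.splitlines()[1:]

# A CSV parser that honours quoting keeps the field, and the record, intact.
parsed_records = list(csv.reader(io.StringIO(raw)))[1:]

print(len(naive_records))   # number of "records" after naive line splitting
print(len(parsed_records))  # number of records after quote-aware parsing
print(parsed_records[0])
```

When reading from a file rather than a string, the Python csv documentation recommends opening it with newline='' so that line breaks embedded in quoted fields are passed to the parser unchanged.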

@sadeghim
Member

sadeghim commented Jul 7, 2021

I see that for QM, the multiline string is processed properly in Solr:
"preparations":["Pin\nSlide"],

But I cannot see this field in https://biocache.ala.org.au/occurrences/ca374cf5-b1c9-4ce2-9d84-ea434bf49950, which may be a bug in biocache hub, as I do see it in the ws: https://biocache.ala.org.au/ws/occurrences/ca374cf5-b1c9-4ce2-9d84-ea434bf49950

@nickdos any thoughts on this?

@nickdos

nickdos commented Jul 7, 2021

> I see that for QM, the multiline string is processed properly in Solr:
> "preparations":["Pin\nSlide"],
>
> But I cannot see this field in https://biocache.ala.org.au/occurrences/ca374cf5-b1c9-4ce2-9d84-ea434bf49950, which may be a bug in biocache hub, as I do see it in the ws: https://biocache.ala.org.au/ws/occurrences/ca374cf5-b1c9-4ce2-9d84-ea434bf49950
>
> @nickdos any thoughts on this?

Added AtlasOfLivingAustralia/la-pipelines#478

@sadeghim
Member

sadeghim commented Jul 7, 2021

Closing this as the main issue here is fixed.

Remaining problem has been addressed in another issue: AtlasOfLivingAustralia/la-pipelines#478

sadeghim closed this as completed Jul 7, 2021