January 2019 DAC codelist updates #283
These updates are all from the DAC XML source, available here: https://webfs.oecd.org/crs-iati-xml/Lookup/
Related IATI discuss post:
@andylolz for the BAs' benefit, would you mind providing a diff between your import and the original XML file (if you have something like that available, that is)? Did you use the script you suggested we'd use in another PR? Thank you!
Nope – I did this manually :) The script in #172 processes the DAC Excel file (well, it processes some CSV on datahub.io, but that comes from the DAC Excel file). @bill-anderson appears to suggest the Excel file should not be used ("more sustainable solution" etc) so this PR uses the XML instead.
A diff is maybe tricky because the DAC XML file (available here) is just one big file. But I can explain the steps I went through. I did the following bits of cleanup:
I think that’s everything. Here’s what I haven’t done:
The diff in this PR shows that quite a lot of stuff has changed. I guess that’s mostly because the source has changed from Excel to XML, and there are some mismatches between the two. I think it will be difficult to verify and merge this PR for that reason. If the goal is to eventually use XML from the DAC as the source for these replicated codelists, then I’d be tempted to go back to the DAC technical team with a list of stuff to fix at their end, and use the Excel file as the source in the interim.
I’m very pleased you’re looking at this, because it’s really important that these replicated codelists are kept in sync with source. For instance, a validator might say that a dataset is invalid because a bad sector code is used, when in fact the problem might be that the IATI replicated Sector codelist is out of sync, and doesn’t include a complete list of sector codes. A publisher could also be scored down on the Aid Transparency Index for the same reason. Or an aid management system might rely on these codelists for interpreting published IATI data.
Anyway – I’d be happy to discuss next steps.
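To make the validator scenario above concrete, here is a minimal sketch of how a stale replicated codelist can cause a false "invalid code" report. The `<codelist-item>`, `<code>` and `<name>/<narrative>` elements follow the snippet reviewed in this PR; the `<codelist>` wrapper, the `status="withdrawn"` value, the sample file contents and the function names are assumptions made for illustration, not the behaviour of any particular validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical Sector codelist that predates this update: it lacks 11250.
STALE_SECTOR_XML = """<codelist>
  <codelist-item status="active">
    <code>41050</code>
    <name><narrative>Flood prevention/control</narrative></name>
  </codelist-item>
</codelist>"""

# Hypothetical Sector codelist after this update: 11250 has been added.
# The status value "withdrawn" is an assumption; status is ignored below.
CURRENT_SECTOR_XML = """<codelist>
  <codelist-item status="withdrawn">
    <code>41050</code>
    <name><narrative>Flood prevention/control</narrative></name>
  </codelist-item>
  <codelist-item status="active">
    <code>11250</code>
    <name><narrative>School feeding</narrative></name>
  </codelist-item>
</codelist>"""


def load_codes(xml_text):
    """Return the set of <code> values found in a codelist document."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("code", default="").strip()
        for item in root.iter("codelist-item")
    }


def unknown_codes(used_codes, codelist_codes):
    """Return the codes a naive validator would flag as not on the codelist."""
    return sorted(set(used_codes) - codelist_codes)


used_in_dataset = ["11250"]  # a publisher uses the new School feeding code
print(unknown_codes(used_in_dataset, load_codes(STALE_SECTOR_XML)))    # ['11250']
print(unknown_codes(used_in_dataset, load_codes(CURRENT_SECTOR_XML)))  # []
```

The same data is flagged or accepted depending only on which copy of the codelist the tool holds, which is the failure mode described above.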
@andylolz fab, thanks! Petya and the BAs have this on their todo list, and it'll be checked this week!
@andylolz thanks so much for your work on this and for clarifying the steps you have undertaken. The crucial bit here is to again get confirmation from the OECD DAC that the Excel and XML include exactly the same content, which at the moment is not the case! We were promised that the XML would be in sync with the source file. I have copied you in on the email I sent to Valerie from the DAC so that we get an answer from them and are able to proceed with the changes as soon as possible. Thanks again!
We have now received a response from the OECD that the XML file has been updated and that both the Excel and XML files have been pulled from the same source.
From a quick look at the differences I identified before, the codelists are now identical in the Excel and XML files, so I think we can use the updated XML to update the codelists on the IATI website. @andylolz Is there a way of easily re-doing what you have done so far with the updated XML file? Then I can review the pull request. If it requires a lot of manual work for you, then I can look into making the comparison and adding the pull requests.
@PetyaKangalova no problem – I’ll try and get this sorted today.
Thank you Andy!
Thanks @andylolz! I am off on Monday and in meetings all of Tuesday but should be able to review mid-next week! Thanks again!
Okay – PR updated using the latest (updated) version of DAC XML. I followed the same steps described above.
@andylolz thanks again for redoing the commit. Really appreciate it! It took me a while to review all the changes as there are quite a lot of them! See summary below:
Next steps:
There are indeed! Great work reviewing!
Kk, done.
Oh, good spot! The same applies to 41050 (Flood prevention/control), which has also disappeared. Also, the following withdrawn sector codes have disappeared:
They may have been replaced by other codes, but the idea is they’re supposed to remain in perpetuity as |
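The check discussed here (a code that was on the previous list should not simply vanish from the updated one) can be sketched roughly as follows. The element names follow the codelist snippet reviewed in this PR; the `<codelist>` wrapper, the sample data and the function names are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical previous version of the Sector codelist.
PREVIOUS_XML = """<codelist>
  <codelist-item status="active">
    <code>41050</code>
    <name><narrative>Flood prevention/control</narrative></name>
  </codelist-item>
  <codelist-item status="active">
    <code>74010</code>
    <name><narrative>Disaster prevention and preparedness</narrative></name>
  </codelist-item>
</codelist>"""

# Hypothetical update in which 41050 has been dropped entirely.
UPDATED_XML = """<codelist>
  <codelist-item status="active">
    <code>74010</code>
    <name><narrative>Disaster prevention and preparedness</narrative></name>
  </codelist-item>
</codelist>"""


def codes_in(xml_text):
    """Return the set of <code> values in a codelist document."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("code", default="").strip()
        for item in root.iter("codelist-item")
    }


def disappeared_codes(previous_xml, updated_xml):
    """Return codes present in the previous list but absent from the update."""
    return sorted(codes_in(previous_xml) - codes_in(updated_xml))


print(disappeared_codes(PREVIOUS_XML, UPDATED_XML))  # ['41050']
```

Running a check like this before review would surface codes such as 41050 without anyone having to spot them by eye.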
I’ve mentioned elsewhere that I’m in favour of scrapping this changelog. I’m unconvinced it’s worth your time. It wasn’t updated for the last DAC codelist update (see: IATI/IATI-Guidance#312) so it’s only a partial list of changes anyway.
Okay – this is very generous of you, but again I don’t think this should be standard practice. Tool providers should be keeping an eye on discuss, or routinely pulling from source. That’s the system as documented. If they start relying on updates from you then that just becomes an extra overhead for you.
Thank you!
Thank you for flagging. I missed this one!
Yes, I agree! I also noticed that there were a few new codes marked 'withdrawn' as of 2015 that were not on the IATI list. As they are already withdrawn I was not so concerned, but it means the XML is not consistent. On your point about the changelog, I agree that it is a lot of effort. However, this time round there are quite a lot of new codes, and it will be important to alert people to which ones those are and to make sure organisations can start using them via the various publishing tools. Hence dropping them a quick email to speed up the process, but it is indeed the responsibility of the tool providers to keep their tools up to date. Waiting to hear from Valerie and will then action the changes!
Excellent – all good!
Yes that’s true, but I’d expect DAC to have a better record of withdrawn codes than IATI (since IATI only started recording these relatively recently). So withdrawn codes in the XML that were not previously known to IATI are probably a good thing :)
Approving the full pull request following a few revisions and updated commits. Many thanks @andylolz
Leaving for @IATI/devs to merge and deploy next week.
<codelist-item status="active">
    <code>0</code>
    <name>
        <narrative>NON FLOW ITEMS</narrative>
I have a feeling this code is deliberately excluded from the IATI replicated codelist. @bill-anderson can confirm or deny.
@andylolz @PetyaKangalova is the comment above holding up the merge, or can it still go ahead?
Aha – here’s why I think this is deliberately excluded: #16 (comment)
I have no idea whether that means they should or shouldn’t be included. I shall leave it with you to decide!
I would agree that finance type '0 - non flow items' does not fit within the activity standard, as it does not describe a type of finance at activity level. It is much wider than that. I would propose that we do not replicate it. Waiting for confirmation from @bill-anderson, so @samuele-mattiuzzo please hold off merging until then.
I just talked to @bill-anderson and the view is that we keep this code in, the reason being that we replicate all 'OECD DAC' codelists exactly as they are in the table provided by the DAC. @samuele-mattiuzzo this means we can continue merging this request.
However, it would be useful to add a note to the Finance Type and Finance Type (Category) codelists stating that non-flow items are not activity-specific and that we do not expect these codes to be used when reporting finance types in the IATI activity standard.
Not really sure about the note idea.
If it’s important enough that it’s a problem, then you can either add a rule to the ruleset, or leave the codes out.
I guess in general, I think it’s best to avoid situations where the standard strongly advises something, but doesn’t enforce in any way.
re: "or leave the codes out."
As already mentioned the standard replicates third-party code lists. This is a well-established principle.
re: "As already mentioned the standard replicates third-party code lists"
Not for the past year or so! That’s what this PR aims to fix.
In fact in the case of non-flow items, these codes have been missing from the replicated codelists since 2014 (see: #16).
Happy to go with whatever to get this moving forward this week.
I’ve added a summary of changes in the PR description.
Seems like the conclusion is: Merge this, and then add a note to the FinanceType codelist (about non-flow items) in a new pull request. @PetyaKangalova is that right?
⭐ 🌟 ⭐
These updates are all from the DAC XML source, available here:
https://webfs.oecd.org/crs-iati-xml/Lookup/
This replaces #249.
To summarise the changes here:
Aid Type
Codes added:
H03 - Asylum-seekers ultimately accepted
H04 - Asylum-seekers ultimately rejected
H05 - Recognised refugees

Sector Category
Codes added:
123 - Non-communicable diseases (NCDs)

Sector
Codes added:
11250 - School feeding
12310 - NCDs control, general
12320 - Tobacco use control
12330 - Control of harmful use of alcohol and drugs
12340 - Promotion of mental health and well-being
12350 - Other prevention and treatment of NCDs
12382 - Research for prevention and control of NCDs
15190 - Facilitation of orderly, safe, regular and responsible migration and mobility
16070 - Labour Rights
16080 - Social Dialogue
24050 - Remittance facilitation, promotion and optimisation
25030 - Business development services
25040 - Responsible Business Conduct
43060 - Disaster Risk Reduction
43071 - Food security policy and administrative management
43072 - Household food security programmes
43073 - Food safety and quality
74020 - Multi-hazard response preparedness
93011 - Refugees/asylum seekers in donor countries - food and shelter
93012 - Refugees/asylum seekers in donor countries - training
93013 - Refugees/asylum seekers in donor countries - health
93014 - Refugees/asylum seekers in donor countries - other temporary sustenance
93015 - Refugees/asylum seekers in donor countries - voluntary repatriation
93016 - Refugees/asylum seekers in donor countries - transport
93017 - Refugees/asylum seekers in donor countries - rescue at sea
93018 - Refugees/asylum seekers in donor countries - administrative costs
Codes withdrawn:
41050 - Flood prevention/control
74010 - Disaster prevention and preparedness

Finance Type Category
Codes added:
0 - NON FLOW ITEMS

Finance Type
Codes added:
1 - GNI: Gross National Income
2 - ODA % GNI
3 - Total Flows % GNI
4 - Population
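A summary like the one above could, in principle, be generated rather than written by hand. The sketch below compares two versions of a codelist and reports codes added and codes newly withdrawn. It assumes each item carries a `status` attribute as in the snippet reviewed in this PR, and that withdrawn items use `status="withdrawn"`; the `<codelist>` wrapper, sample data and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical "before" and "after" versions of the Sector codelist.
OLD_XML = """<codelist>
  <codelist-item status="active">
    <code>41050</code>
    <name><narrative>Flood prevention/control</narrative></name>
  </codelist-item>
</codelist>"""

NEW_XML = """<codelist>
  <codelist-item status="withdrawn">
    <code>41050</code>
    <name><narrative>Flood prevention/control</narrative></name>
  </codelist-item>
  <codelist-item status="active">
    <code>11250</code>
    <name><narrative>School feeding</narrative></name>
  </codelist-item>
</codelist>"""


def statuses(xml_text):
    """Map each code to the status attribute of its <codelist-item>."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("code", default="").strip(): item.get("status")
        for item in root.iter("codelist-item")
    }


def summarise_changes(old_xml, new_xml):
    """Return codes added and codes that moved from active to withdrawn."""
    old, new = statuses(old_xml), statuses(new_xml)
    added = sorted(code for code in new if code not in old)
    withdrawn = sorted(
        code for code in new
        if new[code] == "withdrawn" and old.get(code) == "active"
    )
    return {"added": added, "withdrawn": withdrawn}


print(summarise_changes(OLD_XML, NEW_XML))
# {'added': ['11250'], 'withdrawn': ['41050']}
```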