Added KB NL policy #9

Merged: 3 commits into digipres:main, Mar 29, 2024

bitsgalore
Contributor

Second attempt (now only with modified CSV).

Note that the original CSV contains leading/trailing whitespace characters in some records. According to RFC 4180, these can cause problems because they are considered part of the field:

  4. Within the header and each record, there may be one or more
    fields, separated by commas. Each line should contain the same
    number of fields throughout the file. Spaces are considered part
    of a field and should not be ignored.

So in this version I stripped those out. As a side effect, this introduces some warnings when I try to validate the file with CSVLint. These appear to be (at least partially) related to the fact that the first line of the file contains unstructured data (which CSVLint also complains about for the original file). I'm also not entirely sure how reliable the CSVLint tool is, so I'll just leave it at this.

Hope this is useful.
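
To illustrate the RFC 4180 point above, here is a minimal sketch of how padded fields come through a standard CSV parser and what stripping them does. The sample rows are hypothetical and are not taken from the actual policy CSV; Python's `csv` module is used only as an example parser.

```python
import csv
import io

# Hypothetical sample rows (assumption): trailing spaces after some fields.
sample = "NAME OF INSTITUTION,COUNTRY ,POLICY \nKB,NL ,Yes \n"

# The parser keeps spaces as part of each field, as RFC 4180 describes.
raw_rows = list(csv.reader(io.StringIO(sample)))

# Stripping every field is the kind of cleanup described in the comment above.
clean_rows = [[field.strip() for field in row] for row in raw_rows]

print(raw_rows[0])    # padded header fields, spaces preserved
print(clean_rows[0])  # same fields with the padding removed
```

A lookup such as `row[1] == "NL"` would fail against the raw rows and succeed against the cleaned ones, which is why the padding matters to downstream tools.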

@ross-spencer
Collaborator

Hi @bitsgalore, this is good work and the layout of the commits is helpful. Unfortunately, the current script breaks because it looks for:

  • ,NAME OF INSTITUTION,COUNTRY ,POLICY ,POLICY URL,STRATEGY ,STRATEGY URL"

But the changes make it:

  • ,NAME OF INSTITUTION,COUNTRY,POLICY,POLICY URL,STRATEGY,STRATEGY URL

This is because of the use of front-matter in the CSV.

So, can you perhaps rebase and drop the last commit for now?

git rebase -i HEAD~2
d ef9c52a removed leading and trailing spaces <-- dropping this commit

And I'll open an issue about making the workflow in this repo more sustainable? (I appreciate it must be frustrating so I do apologise)
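
One possible way to make the header lookup described above tolerant of padding is sketched here. The repository's actual script is not shown in this thread, so `EXPECTED_HEADER`, `find_header_index`, and the sample front-matter line are all hypothetical names and data; only the two header variants come from the comment above.

```python
import csv
import io

# Header fields as quoted in the thread, written without padding.
EXPECTED_HEADER = ["", "NAME OF INSTITUTION", "COUNTRY", "POLICY",
                   "POLICY URL", "STRATEGY", "STRATEGY URL"]


def find_header_index(text):
    """Return the index of the first row matching EXPECTED_HEADER once
    each field is stripped of padding, or None if no row matches."""
    for index, row in enumerate(csv.reader(io.StringIO(text))):
        if [field.strip() for field in row] == EXPECTED_HEADER:
            return index
    return None


# Hypothetical files: one line of front matter, then either header variant.
front_matter = "Some front matter,,,,,,\n"
padded = front_matter + ",NAME OF INSTITUTION,COUNTRY ,POLICY ,POLICY URL,STRATEGY ,STRATEGY URL\n"
stripped = front_matter + ",NAME OF INSTITUTION,COUNTRY,POLICY,POLICY URL,STRATEGY,STRATEGY URL\n"

print(find_header_index(padded), find_header_index(stripped))
```

Comparing stripped fields rather than the raw line means the same lookup would accept both the padded and the cleaned header, at the cost of also accepting other whitespace variations.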

@ross-spencer
Collaborator

Notes on how to move forward are in #10. If you have time to comment, it would be appreciated, @bitsgalore.

@bitsgalore
Contributor Author

@ross-spencer I just added another commit that reverts the formatting change (I couldn't get the rebase to work). I agree that the front matter is probably the root of the problem here. It's not really good practice (although I understand why it's used here), and it also seems to confuse several CSV processing tools. I will have a look at the sustainability issue when I have a bit more time.

@ross-spencer
Collaborator

Thanks @bitsgalore

ross-spencer merged commit 137263a into digipres:main on Mar 29, 2024