Add HTML input support #313

Merged
merged 7 commits into from Feb 12, 2020

fsteeg commented Jan 23, 2020

Parse HTML with jsoup, write XML. See example in test.

See #312

Opening this as a draft pull request for some initial discussion. I suggest we only merge once we have successfully used this in our full use case. In particular, the config options and the output format are still to be determined.

@dr0i: I've set this up as a separate Gradle project, mostly because it adds the jsoup dependency, and because it fits with the overall structure that we have. What do you think?

fsteeg added 7 commits Jan 23, 2020
Parse HTML with jsoup, write XML. See example in test.

See #312
To use with decode-xml, but how to test?

See #312
With `decode-html` flux command

See #312
Set generated record ID, only process content of leaf nodes

See #312
See #312
@fsteeg fsteeg changed the title Basic HtmlReader with `html-to-xml` flux command Add HTML input support Feb 7, 2020

fsteeg commented Feb 7, 2020

With the (functionally reviewed) scenarios in hbz/oerindex#2 and hbz/oerindex#3, this could now resolve #312. It also contains 47a5ba7, which resolves #314.

@fsteeg fsteeg marked this pull request as ready for review Feb 7, 2020
@fsteeg fsteeg requested a review from dr0i Feb 7, 2020
dr0i approved these changes Feb 11, 2020
@fsteeg fsteeg merged commit 3c9998c into master Feb 12, 2020
2 checks passed
continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed
2 participants