add new images and schema-level docs for all eml modules #316

Closed
mbjones opened this issue Oct 24, 2018 · 20 comments

@mbjones mbjones commented Oct 24, 2018

No description provided.

@mbjones mbjones added this to the EML2.2.0 milestone Oct 24, 2018
@mbjones mbjones added the next label Oct 24, 2018

@mobb mobb commented Oct 30, 2018

I did a bit of testing with the Oxygen editor's documentation generation tool, which exports images and embedded documentation for browsing.

Limitations:
Oxygen only exports the content of the xs:documentation element, but EML uses xs:appinfo with DocBook XML inside. So the appinfo content has to be moved to a documentation element, where it will only show as plain text. However, the HTML rendering will respect newlines, etc., so we can use this to do a bit of formatting.
Checked in a stylesheet that does the following:
a) copies an xsd file
b) moves xs:appinfo content to an xs:documentation node, separated into functional paragraphs
c) uses each element's local-name as the label for that section

process:
xsltproc eml_appinfo2documentation.xsl ../xsd/eml.xsd > ../tmp/eml.xsd
in OxygenXML editor:
Tools > Generate Documentation > XML Schema Documentation

see attached screenshot
eml_docs_o2_test

@mobb mobb commented Oct 31, 2018

things to resolve and finish (if we decide to use Oxygen-generated documentation)

a) might want to copy over to a <xs:documentation> element, (and preserve xs:appinfo).
Plus: You would see the original xs:appinfo as nice XML in the block titled "Source". As is, the text content of xs:documentation is dumped to the screen.
Minus: you'd see the xs:appinfo content 3 times (converted at top, plus raw and original in source)

b) confirm this works for CDATA sections. (cannot identify a CDATA section with XPath, but I think the copy translates the examples within.)

c) script to run through all xsd files and dump in a temp folder (or some other)
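
Item (c) could look roughly like the following. This is only a sketch under assumptions: the function name `prep_schemas` and the `XSLTPROC` override are hypothetical, and the directory layout is taken from the `xsltproc` command quoted earlier in the thread, not from the script that was eventually checked in.

```shell
#!/usr/bin/env bash
# Sketch: apply the appinfo-to-documentation stylesheet to every schema file.
# prep_schemas and the XSLTPROC override are illustrative names, not part of the repo.

prep_schemas() {
  local src_dir="$1" out_dir="$2" stylesheet="$3"
  local xslt="${XSLTPROC:-xsltproc}"   # allows a different XSLT command to be swapped in
  local xsd
  mkdir -p "$out_dir" || return 1
  for xsd in "$src_dir"/*.xsd; do
    [ -e "$xsd" ] || continue          # the glob matched nothing
    # Write each transformed schema under the same file name in the staging dir
    "$xslt" "$stylesheet" "$xsd" > "$out_dir/$(basename "$xsd")" || return 1
  done
}

# Example call (paths assumed; run from the directory holding the stylesheet):
# prep_schemas ../xsd ../tmp eml_appinfo2documentation.xsl
```

Keeping the output file names identical to the inputs means any relative references between schema files should still resolve inside the staging directory.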

@mobb mobb commented Nov 25, 2018

tasks

  • create and populate staging dir for above (pre o2 export)

    • used sibling dir, tmp.
  • run o2 from the command line
    note that O2 uses the parent dir of the schema as parent of the output. so generated docs are in tmp/docs.
    here is the cmd line (run from top of checkout):
    bash-3.2$ /Applications/Oxygen\ XML\ Editor/schemaDocumentation.sh tmp/eml.xsd -out:docs/index.html -format:html -split:namespace

  • build target
    to be honest, it's been ages since I've done this and I don't have the java stuff installed anywhere anymore. So I won't be able to test it anyway.
    I think it will be more efficient if one of you java-types does it and I walk you through the bits and pieces. or you set up the template and I'll fill it out.

  • script to build documentation and deposit in docs/schema
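
The command-line step in the task list above could be wrapped as follows. This is a sketch, not the checked-in script: `generate_schema_docs` and the `OXYGEN_DOC_CMD` override are hypothetical names, while the default path and the three flags are the ones quoted in the command above.

```shell
#!/usr/bin/env bash
# Sketch: run Oxygen's schema documentation generator on the transformed schema.
# generate_schema_docs and OXYGEN_DOC_CMD are illustrative names (assumptions).

generate_schema_docs() {
  local schema="${1:-tmp/eml.xsd}"
  local out="${2:-docs/index.html}"    # per the note above, placed under the schema's parent dir
  local cmd="${OXYGEN_DOC_CMD:-/Applications/Oxygen XML Editor/schemaDocumentation.sh}"
  [ -f "$schema" ] || { echo "missing schema: $schema" >&2; return 1; }
  "$cmd" "$schema" "-out:$out" -format:html -split:namespace
}

# Example (run from the top of the checkout, after the transform step):
# generate_schema_docs tmp/eml.xsd docs/index.html
```

Making the generator path overridable keeps the macOS install location out of the logic, which may matter if the build is run on another machine.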

@mobb mobb commented Dec 2, 2018

A few notes on element-level docs using Oxygen, from the cmd line.
first, it works. Here is the output from the schemas (dec 1)
http://sbc.lternet.edu/external/InformationManagement/EML_220schema/docs/

I scripted the transform that moves the appinfo to xs:documentation (see bin/). Then ran the command line above (separate manual step).
There are a few options we could employ

  • use a config file to set options (e.g., handling of Source)
  • output location can be set on the command line, or overridden by config file (cmd line does NOT override config). IMO, cmd line is more transparent

Re licensing:

  • it may be that someone can run the O2 doc generator without an Oxygen license (depends how you interpret the below), but as all my computers have licenses, I can't test this. I'm probably wrong (it's hard to imagine them not wanting a license of some sort), but it's worth a test: https://www.oxygenxml.com/oxygen_scripting.html

@mobb mobb commented Dec 2, 2018

Checked in:

  • a tmp dir with
    • the transformed xsd that Oxygen needs as input.
    • oxygen config file (sample only, not used)
    • docs dir with the oxygen output html (same as sample you see in sbc.lternet.edu link above)
  • bin/prep_documentation.sh (script that adds x'formed xsds to tmp)
    Of course, tmp files can be dumped, and the xsd prepped and html created from a build

@mobb mobb commented Dec 2, 2018

Apparently I have a detached HEAD (eew) and cannot push. will need some advice.

@mbjones mbjones commented Dec 2, 2018

The files look great, @mobb. I'm glad we're going this direction.

A detached HEAD means that, at some point, you checked out a specific commit, which moves HEAD to that commit SHA. There are several ways for that to happen, as described on SO. Any commits made in that state are not on any branch, so they are effectively lost when you check out another branch.

If you haven't committed, then fixing this is as easy as git checkout BRANCH_EML_2_2, which will return you to that branch, and then you can commit and push. If you have already committed, then you'll likely need to roll back your commit using a soft reset (git reset --soft) to the commit that was checked out. I would figure out which one that is using git log --graph. Once your changes are uncommitted, you can switch to the 2.2 branch as before. The exact command you use for the reset depends on the specific state of your repository. Be careful with git reset, as it can be used to completely throw away changes that haven't been pushed (although there are often some under-the-hood ways to recover even those, it requires some serious git-fu). I am still in Geneva so can't chat in real time, but I'll bet @csjx can help get you back to a normal state if needed.
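
As a complement to the soft reset described above, a different and commonly used technique is to park the detached commits on a temporary branch before switching. The sketch below assumes that approach; the function names and the branch name `rescue-detached` are hypothetical, and it is not the procedure that was actually followed in this thread.

```shell
#!/usr/bin/env bash
# Sketch: detect a detached HEAD and keep any commits made there reachable.
# is_detached, rescue_detached, and "rescue-detached" are illustrative names.

is_detached() {
  # git symbolic-ref exits non-zero when HEAD does not point at a branch
  ! git symbolic-ref -q HEAD >/dev/null
}

rescue_detached() {
  local target="$1" rescue="${2:-rescue-detached}"
  if is_detached; then
    git branch "$rescue" || return 1   # label the current commit so it stays on a branch
  fi
  git checkout "$target"
}

# Example (branch name from the thread):
# rescue_detached BRANCH_EML_2_2 && git log --graph --oneline --all
```

After switching, the commits on the temporary branch can be merged or cherry-picked onto the target branch, and the temporary branch deleted once nothing is missing.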

@csjx csjx commented Dec 2, 2018

Yes, let me know if you need a hand on the git stuff @mobb. And yes, the Oxygen-based images look great. Thanks for working on that!

@mobb mobb commented Dec 3, 2018

pushed the prep script and the transformed files up (in tmp dir). I did not check in the Oxygen output (html and img), mainly because the tmp/docs dir is large (170 MB), and if the plan is to generate them with a build, there is no need to store output in the repo.

So I think what we really need next is the build target (and a test of whether the script can run without a license), and I'd like one of you java-folk to set that up. @csjx ?

@mobb mobb commented Jan 11, 2019

the script is done (7c60bc3)
and generates ~40 files and 700 images
The output is not checked in yet because one file is bigger than GitHub's file size limit of 100 MB. it's the schHierarchy.html -- basically the ToC
One option is Git Large File Storage (LFS), but we'd have to enable it for the repo, and contributors need the plugin too. so we may want to talk first.
Another would be some other splitting of the HTML, maybe manual. But we should see how it fits with the other documentation first.
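
Before deciding between these options, it may help to check which generated files actually exceed the limit. A minimal sketch, assuming the output lives in a `docs` directory; `find_oversized` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list generated files above a size limit before committing.
# find_oversized is an illustrative name; 100 is the limit (in MB) mentioned above.

find_oversized() {
  local dir="${1:-docs}" limit_mb="${2:-100}"
  # -size +Nk matches files larger than N KiB (file sizes are rounded up to whole KiB)
  find "$dir" -type f -size +"$((limit_mb * 1024))"k
}

# Example (directory assumed):
# find_oversized tmp/docs 100
```

An empty result would suggest the output can be committed as-is; any path printed is a candidate for removal, splitting, or large-file storage.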

@mobb mobb added the documentation label Jan 11, 2019
@mobb mobb changed the title add new images for all eml modules add new images and schema-level docs for all eml modules Jan 11, 2019

@amoeba amoeba commented Jan 24, 2019

@mobb, @stevenchong, and I chatted about this this morning and I offered to take a look. I think this huge 133 MB-ish file isn't necessary: it's just a huge hierarchy, and it takes forever to load on my machine anyway, so I doubt we wanna serve it anywhere. With this file removed, the rest of the docs work and make sense.

I want to bring up something that I could use some feedback on. With the new approach based on Oxygen, the schema diagrams aren't quite as complete. That is, for a given schema diagram, Oxygen doesn't include quite as many of the child elements as the old ones did, so the individual images are less useful. I re-ran the images and am using GitHub's diff view to show the before and after. Make sure to click "Show rich diffs" to see the diffs:

screen shot 2019-01-23 at 4 16 20 pm

The diff -> amoeba/EML@a03651d?diff=split&short_path=4aacebb#diff-4aacebb6469314af80f19eb79771334b

Are we okay with those differences? I and others find the current diagrams really useful but if they were less "full" they'd be less useful.

While the above is a bit of an issue, Oxygen does generate a really nice, interactive site. See a live version here. Maybe this outweighs the changes to the diagrams?

Last thing: Do we wanna ditch the current HTML docs (generated with ant build) in favor of this new site, or should we integrate the new images into the ant build? The main difference I see is that the Oxygen docs don't have all the prose we wrote into the schema diagrams. i.e., we don't get a really nice intro like https://knb.ecoinformatics.org/external//emlparser/docs/eml-2.1.1/index.html. But maybe this is moot because we're switching to Bookdown?

@csjx csjx commented Jan 24, 2019

My understanding is that it is moot, in that the Bookdown version of the docs will provide all of the summary prose, whereas the Oxygen-generated documentation will provide the technical details. Integrating the two will require some finesse.

@mobb mobb commented Jan 24, 2019

Re @amoeba 's last comment:

we don't get a really nice intro like https://knb.ecoinformatics.org/external//emlparser/docs/eml-2.1.1/index.html.

It's a two-step process. O2 will only process xs:documentation to HTML, and EML keeps its documentation in xs:appinfo. The script I wrote has 2 steps: the first does the transform that moves the appinfo content to xs:documentation, and the second runs the O2 generator. see
https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/bin/build_schema_documentation.sh
Sorry that was not clear (where do we keep documentation-documentation?)

I also left in the script that does only the transform:
https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/bin/prep_documentation.sh

what were your changes to the O2 command line?

@mbjones mbjones commented Jan 24, 2019

Yeah, the bookdown is intended to replace the prose, and the oxygen is meant to provide the schema-level docs. In looking at the image diffs, it seems to me that the main difference is that the oxygen images just don't expand some of the child nodes that we did manually when we generated the original images. However, if we create the interactive, expandable tree via oxygen in HTML, then we may not need the images as static links at all. I'm imagining the book would simply include a chapter that inlines the schema docs that come out of oxygen so they can be browsed and searched.

@amoeba amoeba commented Jan 24, 2019

Ah, @mobb I read that stuff but it didn't make sense as I didn't know anything about appinfo vs. documentation and the script doesn't fully run for me because of a bug I'll fix shortly.

"I'm imagining the book would simply include a chapter that inlines the schema docs..."

Are you thinking via an iframe or something?

@amoeba amoeba commented Jan 25, 2019

Just made some tweaks to your script, @mobb. Can you take a look, re-run it, and, if it made the fixes you wanted, commit it?

@amoeba amoeba commented Jan 25, 2019

After talking with the EML team and also the data team, we decided a good route to go down would be to:

  1. Keep the nice module-level diagrams from earlier versions of EML (in the img folder) in a similar form, as the images are useful in their own right. We use(d) XMLSpy to generate them before, and Oxygen's diagrams aren't quite as complete as XMLSpy's in terms of how much of the sub-tree they display. Part of the reason to keep the module-level docs is that a few folks have expressed interest in them: each is just a single image that you can glance at quickly, and it's all on your screen at the same time.
  2. Also use Oxygen to create the really useful, stand-alone documentation site, because we found we liked this as an additional EML documentation resource.
  3. Use the images from (1) in the Bookdown documentation.

This type of approach would match a lot of software projects where there's an end-user-focused guide (Bookdown) + a developer-focused API docs site (Oxygen). I think this is all in line with the spirit of the discussions in this Issue and outside of it.

New images were added in 04cf0d5. Check the rich diffs to see the side-by-side comparison. I took some liberty in adding a few extra images (and we can generate any more we need). I also optimized the PNGs and saved about 30% on each file.

@mbjones mbjones added in progress and removed next labels Jan 31, 2019
@mbjones mbjones removed the in progress label Jul 25, 2019

@mbjones mbjones commented Aug 18, 2019

Right now we have the module-level docs in the bookdown in an iframe. Let's discuss whether we also want to embed the independent images, and if so, where they would go. But for now I don't think we should hold up the EML 2.2.0 release for this ticket.

@amoeba amoeba commented Aug 29, 2019

We talked in Slack just now about this and decided I'd make some minor tweaks to the module-level images, include them in the docs as static images, and link elsewhere to the Oxygen docs site. Working on that now.

amoeba added a commit that referenced this issue Aug 30, 2019
Addresses #316

@mbjones mbjones commented Aug 31, 2019

Thanks, @amoeba -- your image links look good, and this is now incorporated in the build.

@mbjones mbjones closed this Aug 31, 2019