Export FW to dokieli format #91

Closed
afshinsadeghi opened this issue Jun 26, 2017 · 32 comments

@afshinsadeghi
Collaborator

https://github.com/linkeddata/dokieli

@afshinsadeghi
Collaborator Author

Process:

  1. Export from FW
  2. Show in dokieli

Learn:

## Structure of FW

## Structure of dokieli

Comments in dokieli have two parts.

  1. The marked section of text.

Example:

demonstrating advanced document authoring and interaction without a single point of control💬

structure:
span
mark
sup
/span

tag: mark
id : r-NUMBER
property: "schema:description"
text body

  2. The comment belonging to this part of the text.

Example:

Sarven Capadisli replies

Authors
Sarven Capadisli
Published
2017-04-15 21:24:58
Rights
CC BY 4.0
Canonical
10c0a57b-1da6-4591-afb5-54bc3bedab87
In reply to (part of)
rchitecture and implementation, demonstrating advanced document authoring and interaction without a single point of control. Such an environment provides t
Rendered via
dokieli

Note

What are you trying to tell me? That I can dodge bullets?
Rights
CC BY 4.0

Structure:
aside class:"note do"
  blockquote tag:blockquote cite: URI
    article tag:article id: NUMBER typeof="oa:Annotation" prefix:"few URI including the one in cite"
      h3 class:shema-name
      dl class:author-name
      dl class:published
        dd
          a
            time
            /time
          /a
        /dd
      /dl
      dl class: rights
      dl class: canonical
      dl class: target
      dl class: renderedvia
      section id:
        h2
        div
        dl
      /section
    /article
  /blockquote
/aside
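
As a rough illustration of the first part (the marked section of text), the sketch below builds the span > mark + sup structure from the notes above. It is only an assumption about how an exporter might emit it: the function names are hypothetical, the 💬 inside `sup` is a placeholder, and dokieli's real markup carries more attributes than are listed here.

```javascript
// Escape text so it is safe inside HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the "marked section of text": span > mark + sup.
// The id format (r-NUMBER) and property="schema:description" come from the
// notes above; everything else here is an illustrative assumption.
function renderMarkedText(number, text) {
  const id = 'r-' + number;
  return '<span>' +
    '<mark id="' + escapeHtml(id) + '" property="schema:description">' +
    escapeHtml(text) +
    '</mark>' +
    '<sup>💬</sup>' +
    '</span>';
}

console.log(renderMarkedText(1, 'without a single point of control'));
```

The annotation itself (the `aside` structure) could be generated the same way, one element at a time, once the exact attributes are confirmed against dokieli's source HTML.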

@afshinsadeghi
Collaborator Author

To add the new code, I created a fork of FW here: https://github.com/sadeghiafshin/fiduswriter
@csarven @johanneswilm First, I want to extend the FW "export to html" function. Does that sound logical?

@afshinsadeghi afshinsadeghi changed the title Connection to dokieli Export FW to dokieli format Jun 29, 2017
@csarven
Member

csarven commented Jun 29, 2017

Note that nodes with class "do" are due to being dynamically inserted into DOM. When dokieli does a write operation, it skips over those classes. There is more to the normalisation steps but that may not be necessary to bother with for the time being. For starters, only look at the source HTMLs.
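
A minimal sketch of the idea of skipping "do" nodes, assuming a simplified tree of plain objects rather than real DOM nodes. The node shape `{ tag, classes, children }` and the function name are hypothetical, and this is not dokieli's actual normalisation code.

```javascript
// Hypothetical node shape: { tag, classes: [...], children: [...] },
// with plain strings standing in for text nodes.
// Returns a copy of the tree without any descendant whose classes include "do".
// The root node itself is kept as is.
function stripDynamicNodes(node) {
  if (typeof node === 'string') {
    return node;
  }
  const children = (node.children || [])
    .filter((child) => typeof child === 'string' || !(child.classes || []).includes('do'))
    .map(stripDynamicNodes);
  return { ...node, children };
}

const article = {
  tag: 'article',
  classes: [],
  children: [
    'Some text',
    { tag: 'aside', classes: ['note', 'do'], children: ['dynamic annotation'] },
  ],
};
console.log(JSON.stringify(stripDynamicNodes(article)));
```

In a browser, the same effect could probably be had by cloning the document with `cloneNode(true)` and removing everything matched by `querySelectorAll('.do')` before serialising.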

@afshinsadeghi
Collaborator Author

@csarven

  1. As far as I understand, dokieli provides a way to write and publish a web page entirely in JavaScript, which is why it is not centralized. Is that correct? (This is for my own understanding.)

  2. I did not find an import function for a whole page in dokieli, but there is an "import" for JSON-LD and triples. What does that do?

@csarven
Member

csarven commented Jun 30, 2017

  1. Without overloading the terms, yes, to some extent that's true.
  2. At the moment, there is "Open", if that's what you are trying to achieve. I think we can work with this (and improve where necessary). Otherwise, there is no need to "import" as one would just navigate to the URL. Did I understand you correctly? The JSON-LD, Turtle, and Nanopublications options that you are referring to are for embedding data, i.e., entering a block of data that will be injected into the DOM; when the document is saved/exported, it will be in the HTML (in head).

@afshinsadeghi
Collaborator Author

Thanks, I then tested "Open". I tried to open a URL from the Internet (https://fiduswriter.gesis.org/document/536/) and an exported dokieli web page on my computer (file:///Users/afshin/Downloads/dokieli.20170630T111241857Z.html), but neither worked. Is this a fully functional button?
I could see that the iri value on line 3241 of do.js contains the URL of the web page and that the getResource function on line 898 is called, but on line 907 the http.open('GET', url); call does not seem to work, and this.status on line 916 is empty.
What should the end result of the "Open" button be?

@csarven
Member

csarven commented Jun 30, 2017

Possibly because fiduswriter.gesis.org is not CORS-enabled, so the XHR status appears to be 0. Let me see if I can update Open a bit to route through a proxy.

Note that https://fiduswriter.gesis.org/document/536/ sends header Location: /account/login/?next=/document/536/, requiring authentication. Perhaps a token in the URL can be used to fetch (for read only)?

IIRC, file: is not yet supported, only http.

@csarven
Member

csarven commented Jun 30, 2017

It'd be great if the domain were CORS-enabled, at least along the lines of the following Apache configuration:

SetEnvIfNoCase ORIGIN (.*) ORIGIN=$1
Header set Access-Control-Allow-Origin "%{ORIGIN}e" env=ORIGIN
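
For readers less familiar with Apache, the intent of the two directives above can be sketched in JavaScript as a small function that echoes the request's Origin header back. This is an approximation for illustration only: the function name is hypothetical, and the `Vary: Origin` header is an addition that is not in the Apache snippet (it is commonly recommended when the allowed origin depends on the request).

```javascript
// Given request headers (with lower-cased names, as Node's http module
// provides them), return the CORS response headers to add.
function corsHeadersFor(requestHeaders) {
  const origin = requestHeaders['origin'];
  if (!origin) {
    // No Origin header: not a cross-origin browser request, add nothing.
    return {};
  }
  return {
    'Access-Control-Allow-Origin': origin, // echo the caller's origin
    'Vary': 'Origin', // addition, not part of the Apache snippet
  };
}

console.log(corsHeadersFor({ origin: 'https://dokie.li' }));
```

Echoing every origin effectively opens the resource to any site, so it only makes sense for content that is meant to be publicly readable.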

@afshinsadeghi
Collaborator Author

afshinsadeghi commented Jun 30, 2017

I will update the Apache server...

I could not! I do not have sudo access there. Maybe @johanneswilm can help.

Could it be because of the "http" check? The FW URL uses "https".

@csarven
Member

csarven commented Jun 30, 2017

I don't think so. It appears to redirect from http to https, and then another redirect for the authentication page.

@csarven
Member

csarven commented Jun 30, 2017

I meant that the http(s) scheme is supported.

If there is a read-only publicly accessible URL for the article, we should be able to open that in dokieli. I've added proxy use into dokieli in any case, but the public read URL of the article is still needed. Right now I'm looking into dokieli/dokieli#198

@afshinsadeghi
Collaborator Author

In summary, I am breaking this task into two steps: export HTML from FW, and import it into dokieli.

dokieli can import HTML, but importing runs into the CORS problem, so I am installing the latest version of dokieli on the dev server (https://fiduswriter-devel.gesis.org). For the FW export, I have to extend it so that exported documents include comments, titles, etc. Right now I am going through this in #92.

For now, I am assuming that importing an HTML document into dokieli works with "Open", and I will continue with extending FW's HTML export.

@csarven How does dokieli support proxy use? Should I set up the proxy functionality on my test server?

@johanneswilm
Collaborator

CORS is generally a problem and AFAIK it's generally recommended not to mess with it. That's why Fidus Writer uses proxy views to download things from other web sites.

@afshinsadeghi
Collaborator Author

So I will only consider the case where both FW and dokieli are on the same server.

@johanneswilm
Collaborator

CORS is enforced by the user agent (the browser), so to get around this, one can have the server do whatever needs to be done on the web instead, through what we call a "proxy view".

As far as I understood, dokieli is simply a bunch of files that can be put together in a zip file, right? Do we need to interact with a dokieli server at all? If not, then there shouldn't be any CORS issue.

@johanneswilm
Collaborator

johanneswilm commented Jul 3, 2017

Why on the same server? This shouldn't matter, as long as the request to a different server (wherever the dokieli server is running) is made by the proxy call of the Fidus Writer server and not by the browser.

@afshinsadeghi
Collaborator Author

IMHO, although it runs in the browser, the browser will check the URLs of both and will block the request because of CORS if they are on different servers.

@csarven
Member

csarven commented Jul 3, 2017

I think both exporting FW article to HTML, and importing in dokieli can exist on their own.

For FW, if that particular export function is aligned with dokieli's HTML, the resulting HTML can be used independently of dokieli. Simply publish that as is because that article is "dokieli-ised" anyway.

FW can also export an HTML that doesn't include dokieli's CSS and JS (the minimal that's in head in dokieli articles). In this case, that article can be imported ("Open") from dokieli, and as part of that process it will inject the CSS and JS. We can look into the details for the minimal HTML template. Generally it is along the lines below.

<section id="foo" rel="schema:hasPart" resource="#foo">
  <h2 property="schema:name">Foo</h2>
  <div datatype="rdf:HTML" property="schema:description">
    <!-- any HTML -->

    <!-- sub-sections -->
    <section id="bar" rel="schema:hasPart" resource="#bar">
      <h3 property="schema:name">Bar</h3>
      <div datatype="rdf:HTML" property="schema:description">
        <!-- any HTML -->

        <!-- sub-sub-sections -->
        <section id="baz" rel="schema:hasPart" resource="#baz">
          <h4 property="schema:name">Baz</h4>
          <div datatype="rdf:HTML" property="schema:description">
            <!-- any HTML -->

          </div>
        </section>

        <!-- any HTML -->
        <!-- "aside" is a good candidate here as the last node in this section -->
      </div>
      <!-- "aside" is a good candidate here as the last node in this section -->
    </section>

    <!-- any HTML -->

    <section id="qux" rel="schema:hasPart" resource="#qux">
      <h2 property="schema:name">Qux</h2>
      <div datatype="rdf:HTML" property="schema:description">
        <!-- any HTML -->
      </div>
    </section>

    <!-- any HTML -->
  </div>
  <!-- any HTML -->
</section>

The above HTML is not a strict rule. Normally any HTML is okay. The example above only brings some structure and semantics to sections and asides. aside is a good candidate as the last node of a node (section, div), typically used to place footnotes or annotations within.
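
To make the pattern concrete, here is a minimal sketch of how an exporter might generate such nested sections. The input shape `{ id, title, html, sections }` and the function names are assumptions for illustration; only the attributes (`rel`, `resource`, `property`, `datatype`) are taken from the template above.

```javascript
// Escape text so it is safe inside HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical input shape: { id, title, html, sections: [...] }.
// `html` is assumed to be already-serialised body HTML for the section.
// Sub-sections are rendered inside the parent's description div,
// one heading level deeper (capped at h6).
function renderSection(section, level = 2) {
  const heading = 'h' + Math.min(level, 6);
  const id = escapeHtml(section.id);
  const subSections = (section.sections || [])
    .map((child) => renderSection(child, level + 1))
    .join('');
  return '<section id="' + id + '" rel="schema:hasPart" resource="#' + id + '">' +
    '<' + heading + ' property="schema:name">' + escapeHtml(section.title) + '</' + heading + '>' +
    '<div datatype="rdf:HTML" property="schema:description">' +
    (section.html || '') + subSections +
    '</div>' +
    '</section>';
}

console.log(renderSection({
  id: 'foo',
  title: 'Foo',
  html: '<p>Intro</p>',
  sections: [{ id: 'bar', title: 'Bar', html: '<p>Details</p>' }],
}));
```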

It'd be nice to have CORS enabled on the FW server, but it is not required. dokieli will try to fetch the input HTTP URL; if CORS is enabled, it will proceed, otherwise it will fetch again through its own proxy URL.
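
The fallback described here could look roughly like the sketch below. It is not dokieli's implementation: the function names, the injectable `fetchImpl` parameter, and the way the target URL is appended to the proxy base are all assumptions made for illustration.

```javascript
// Build the proxied URL. Appending the encoded target to the proxy base
// is an assumed layout; a real proxy may expect something different.
function proxiedUrl(proxyBase, targetUrl) {
  return proxyBase + encodeURIComponent(targetUrl);
}

// Try the target directly. If the request is rejected (as a CORS-blocked
// request is in a browser), retry once through the proxy.
async function fetchWithProxyFallback(targetUrl, proxyBase, fetchImpl = fetch) {
  try {
    return await fetchImpl(targetUrl);
  } catch (error) {
    return fetchImpl(proxiedUrl(proxyBase, targetUrl));
  }
}

console.log(proxiedUrl('https://proxy.example/?uri=', 'https://fw.example/document/1/'));
```

Passing `fetchImpl` explicitly keeps the function easy to exercise without a network. A target that still requires a login will fail through the proxy too, which is why a publicly readable URL is needed either way.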

I've just created dokieli/dokieli#200 to address the case where an HTML doesn't include dokieli's CSS and JS.

@johanneswilm
Collaborator

Hey,
I think you are right that the filter can be reused, and once we do Scholarly HTML and RASH export filters, we can probably start out by copying the dokieli export filter. However, doing this in two steps seems to only create problems with CORS, and makes it more difficult for the person running it.

Adding extra files, JS/CSS, etc. isn't really a problem for our exporter system. For the DOCX and ODT export filters, the exporter downloads a preexisting zip file from our server that contains all kinds of data that our exporter doesn't need to understand. Our exporter then injects the XML containing the contents of our article and offers it as a download to the user as an ODT/DOCX file. I think the same should be possible here: on the server we store a zip file containing all the standard resources of the dokieli system (JS/CSS), and possibly the outer parts of the HTML file (incl. links to CSS/JS files). Our exporter then only needs to walk through the document, create the HTML output to fit the dokieli format, and inject that in the right place in the output file.
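
The injection step might be sketched as follows, assuming the outer HTML file from the zip contains a marker where the article belongs. The marker string and function name are hypothetical; neither Fidus Writer nor dokieli is known to use this exact convention.

```javascript
// Hypothetical marker; a real template would define its own insertion point.
const CONTENT_PLACEHOLDER = '<!-- CONTENT -->';

// Insert the generated article HTML into the outer template.
// slice() is used instead of String.prototype.replace so that "$" sequences
// in the article HTML are never interpreted as replacement patterns.
function injectIntoTemplate(templateHtml, articleHtml) {
  const index = templateHtml.indexOf(CONTENT_PLACEHOLDER);
  if (index === -1) {
    throw new Error('Template has no content placeholder');
  }
  return templateHtml.slice(0, index) +
    articleHtml +
    templateHtml.slice(index + CONTENT_PLACEHOLDER.length);
}

const template = '<html><head><title>Export</title></head><body><!-- CONTENT --></body></html>';
console.log(injectIntoTemplate(template, '<article><p>Hello</p></article>'));
```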

What do you think, @csarven? Would it be possible to put everything needed for a dokieli instance in a zip file?

@johanneswilm
Collaborator

Alternatively -- is there some other HTML standard we could export to that dokieli could then import from? It seems like if we have an export filter, it should be to some standard of some kind -- either dokieli itself or the standard of the dokieli document if that is a thing.

@csarven
Member

csarven commented Jul 3, 2017

dokieli is intended to be flexible so that it is not locked into a single HTML template (contrary to other approaches out there). The example HTML template I gave above was only for the purpose of using some of dokieli's features, e.g., building the ToC, having identifiers/semantics for each section/aside, etc. We're trying to capture different HTML+RDFa patterns so that it can be more reusable in those scenarios.

Something like https://dokie.li/new as a shell is completely fine. Or aim your export towards something like https://dokie.li/acm-sigproc-sp , https://dokie.li/lncs-splnproc .

I think the options that jump at me are the following (with increasing amount of work):

  • Have FW's preferred HTML at a URL so dokieli can "Open" it, or so it can be imported ( Open or import local files dokieli/dokieli#200 )
  • Include dokieli's CSS and JS along with FW's preferred HTML in its export
  • Include dokieli's CSS, JS, and reuse its existing HTML patterns in FW's export

If there is a gap in FW/dokieli implementation somewhere, we can try to close that.

@johanneswilm
Collaborator

dokieli is intended to be flexible so that it is not locked into a single HTML template (contrary to other approaches out there).

Ok, that makes sense. But I guess neither exporting to RASH nor Scholarly HTML would work for import into Dokieli, or would one of these (or a third standard) be fully "readable" by dokieli?

I think the options that jump at me are the following (with increasing amount of work):

We have set aside something like two developers over the summer (@sadeghiafshin and one helper starting in a month or so) and the University of Bonn has stopped all other work on OSCOSS to work on just this, so we should do this properly. The consequence of not doing it properly would just be that we won't be able to merge it into Fidus Writer upstream and then it will turn into maintenance hell.

If there is an intermediate HTML format that dokieli can read, that is standardized and is guaranteed to work, then we can export to that instead. It will not make our exporter for this simpler, but the advantage of exporting to a third-party HTML would of course be that we would cover that standard simultaneously.

  • Include dokieli's CSS and JS along with FW's preferred HTML in its export

Ok, but that way would we know for sure that it is working? FW stores its data in a standardized JSON format and only serializes it to HTML to show/edit in the browser. We can either reuse that serializer or create our own tree walker. Creating such a tree walker is fully possible (we have done it for three other formats already).
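
A tree walker of this kind can be sketched in a few lines. The node shape below (`type`, `level`, `text`, `content`) is purely hypothetical and is not Fidus Writer's actual JSON schema; the point is only to show the recursive walk from a JSON tree to HTML.

```javascript
// Escape text so it is safe inside HTML content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical node shape, for illustration only:
// { type: 'doc' | 'heading' | 'paragraph' | 'text', level, text, content: [...] }
function walk(node) {
  // Serialise the children first, then wrap them according to the node type.
  const inner = (node.content || []).map(walk).join('');
  switch (node.type) {
    case 'text':
      return escapeHtml(node.text);
    case 'paragraph':
      return '<p>' + inner + '</p>';
    case 'heading':
      return '<h' + node.level + '>' + inner + '</h' + node.level + '>';
    case 'doc':
      return '<article>' + inner + '</article>';
    default:
      throw new Error('Unknown node type: ' + node.type);
  }
}

console.log(walk({
  type: 'doc',
  content: [
    { type: 'heading', level: 2, content: [{ type: 'text', text: 'Introduction' }] },
    { type: 'paragraph', content: [{ type: 'text', text: 'Hello, world.' }] },
  ],
}));
```

Throwing on unknown node types makes gaps in the walker visible early instead of silently dropping content from the export.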

  • Include dokieli's CSS, JS, and reuse its existing HTML patterns in FW's export

This is what would seem like a proper solution to me. Is there a stable version of dokieli that we could use as a basis for this?

If there is a gap in FW/dokieli implementation somewhere, we can try to close that.

On the Fidus Writer side: most likely. We still do not capture a lot of semantic information, especially data about the authors. Fixing this in Fidus Writer would be of general interest, but it would likely also be the greatest amount of work, because it will mean that all other parts will need to be expanded upon: other export filters, editor, native file format.

@afshinsadeghi
Collaborator Author

I think it would be great to have a template that is readable by dokieli and will not change soon. For example, a template that supports titles, subtitles, normal text, references and comments.

@johanneswilm
Collaborator

@csarven Ok, we have talked about it. It sounds like this is the best solution

Include dokieli's CSS, JS, and reuse its existing HTML patterns in FW's export

Could you prepare such a zip file with the CSS and JS for us or explain to us how to do it? And could we have a template for the HTML file with citations, figures, abstract, etc. in the format that dokieli prefers? Then we will use that as the standard and follow it.

@csarven
Member

csarven commented Jul 4, 2017

Sounds good! This is something I'd like to have documentation for in dokieli as well: dokieli/dokieli#201 .

Will keep you posted.

@csarven
Member

csarven commented Jul 16, 2017

Started documentation: https://dokie.li/docs . I'll expand on the design decisions, patterns etc.

@afshinsadeghi
Collaborator Author

Thanks, @csarven. I am waiting for more, especially the template part. I will start with the template you shared above next week. Will the final version be very different?

@csarven
Member

csarven commented Jul 22, 2017

It is in the general direction. I'll get the canonical pattern done next week.

@afshinsadeghi
Collaborator Author

Continuing with extending base.js in fw/document/static/js/es6_modules/exporter/html/base.js,
adding RDF tags via
jQuery(htmlCode).find()

@csarven
Member

csarven commented Aug 11, 2017

Note that dokieli/dokieli#200 is resolved.

@afshinsadeghi
Collaborator Author

Great. I will update my dev version.

@afshinsadeghi
Collaborator Author

done
