Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Prismic V2 format #17

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jaen
Copy link

@jaen jaen commented Feb 1, 2024

Hello,

On behalf of Prismic, I come bearing this PR adding support for the new rich text JSON format used the V2 API (e.g. the Migration API). As an outside contractor who's rather fresh to the team, I can't claim I haven't missed some format subtlety, but I've tried my best not to and the updated tool's output was also verified internally. None the less, please also double-check on your end, if you are able.

I think this should be mostly ready for review – there's only an outstanding question of the CLI interface.

Currently, the tool takes the first CLI argument as the text to convert, which is a bit of a weird choice (as you have to do "$(cat some/file)" to take input from a file), but generally works. But, adding the V2 format and not wanting to break the existing V1 functionality, I have used optparse to add a --format v2 switch, which had an unexpected side-effect of having dashes in the text-to-convert confusing the parser into thinking they are arguments, thus forcing usage of -- to separate the text from the rest of arguments.

Given that adding format choice parameter can already break compatibility with existing usages of the tool, what do you think about leaving the existing CLI entry points in bin as-is and instead add new one, say prismic-convert (or kramdown-prismic or whatever, name to bikeshed) that has a more convetional interface instead, for example:

# You can now specify input and output format (e.g. to upgrade your existing JSONs)
# Also the parameter is now, more conventionally, a file path, instead of inline document string
prismic-convert --input-format=prismic-v1 --output-format=prismic-v2 some/file/path.json
# You can use stdin to pass documents
cat some/file/path.json | prismic-convert -i markdown -o prismic-v2
# If you want to replicate specifying documents inline in bash, `echo` or a heredoc are always there for you
# The output would default to the new Prismic V2 - input we could try to infer from extension, or ask to be always specified
echo "<strong>HULK SMASH</strong>" | prismic-convert -i html

What do you think about that?

There's also a few notes that might help you whjen reviewing:

  • this was tested locally on Ruby 2.6, so should still maintain Ruby version compatibility,
  • the implementation of the new format is basically done to/from (in the input/output direction, respectively) as not to refactor the converters/parsers too much unnecessarily,
  • I have added tests for the new format - mostly by copying existing tests, but I have also added a few tests that compare a sample document between the formats. Now that I think about it, I have forgotten to tests FormatUtils::V2 itself, I can add that if you wish,
  • there seem to be certain inconsistencies in whitespace handling of the tool - kramdown doesn't seem to normalise whitespace when reading markdown - that came up when testing the full document. I tried to see if I can fix that, but it ended up being a bit of a rabbit hole and given the tool already had this behaviour I gave up (and normalised that for comparison in the test instead, with appropriate comment) - not sure what you think the solution should be here,
  • I have normalised the order of certain attributes as well as spans (the tool seems to currently return them in depth-first post-order traversal order, while Prismic API seems to first sort by type, then by span start) to match what Prismic API seems to return, again for easier comparison - if you think that's better dealt with in tests only without affecting the output, I can change that,
  • turns out convert_a can be called not only for images - for example toplevel anchors also trigger that method and the CLI blows up. I have not fixed it, but I can if you want,
  • it seems that Prismic V2 API doesn't support linkTo the way you use it - I haven't added any warnings about it, nor changed code to remove it, should I,
  • I usually prefer to use "string" all the time and {:a => b} over {a: b} – I have tried to match what was already there, but might have slipped to my default unwittingly and it seems there's no linting set up, so let me know if I messed up.

I think that's all that comes to my mind now, hopefully the PR's not too terrible.

@francois2metz
Copy link
Collaborator

Hi! I'll try to take a look at it next week at best.

@afska
Copy link

afska commented Feb 28, 2024

Hey @jaen, I've tried using the markdown2prismic script from this branch to convert an example, and put that in one of my slice's mock content to check if everything is working correctly in Slice Simulator. I found two problems:

1) Links are not rendered

1
(a fork should be a link)

This tool returns a JSON slightly different to what Slice Simulator does. Example link from kramdown-prismic:

{
    "content": {
        "spans": [
            {
                "end": 61,
                "start": 53,
                "type": "em"
            },
            {
                "data": {
                    "url": "https://github.com/prismicio/kramdown-prismic"
                },
                "end": 82,
                "start": 76,
                "type": "hyperlink"
            },
            {
                "end": 16,
                "start": 8,
                "type": "strong"
            }
        ],
        "text": "This is markdown that will be converted to Prismic's RichText format, using a fork of the kramdown-prismic ruby gem."
    },
    "type": "paragraph"
}

where Slice Simulator generates:

{
		"type": "paragraph",
		"content": {
			"text": "This is markdown that will be converted to Prismic's RichText format, using a fork of the kramdown-prismic ruby gem.",
			"spans": [
				{
					"type": "em",
					"start": 53,
					"end": 61
				},
				{
					"type": "hyperlink",
					"start": 76,
					"end": 82,
					"data": {
						"__TYPE__": "ExternalLink",
						"url": "https://github.com/prismicio/kramdown-prismic",
						"target": "_self"
					}
				},
				{
					"type": "strong",
					"start": 8,
					"end": 16
				}
			]
		},
		"direction": "ltr"
	}

The only missing keys are __TYPE__ and target, adding that manually makes everything work fine.

2) When using images, Slice Simulator fails to render the whole field. It shows the field as empty:

image

The output from the script was:

{
            "content": {
              "spans": [
              ],
              "text": ""
            },
            "data": {
              "alt": "un alt",
              "origin": {
                "url": "https://images.unsplash.com/photo-1546548970-71785318a17b?crop=entropy&cs=srgb&fm=jpg&ixid=M3wzMzc0NjN8MHwxfHNlYXJjaHwxfHxjaXRydXMlMjBmcnVpdHxlbnwwfHx8fDE3MDkwODAzMzd8MA&ixlib=rb-4.0.3&q=85"
              }
            },
            "type": "image"
          }

while Slice Simulator generates:

{
						"type": "image",
						"data": {
							"__TYPE__": "ImageContent",
							"url": "https://images.unsplash.com/photo-1546548970-71785318a17b?crop=entropy&cs=srgb&fm=jpg&ixid=M3wzMzc0NjN8MHwxfHNlYXJjaHwxfHxjaXRydXMlMjBmcnVpdHxlbnwwfHx8fDE3MDkwODAzMzd8MA&ixlib=rb-4.0.3&q=85",
							"alt": "flat lay photography of sliced pomegranate, lime, and lemon",
							"origin": {
								"id": "7r1HxvVC7AY",
								"url": "https://images.unsplash.com/photo-1546548970-71785318a17b?crop=entropy&cs=srgb&fm=jpg&ixid=M3wzMzc0NjN8MHwxfHNlYXJjaHwxfHxjaXRydXMlMjBmcnVpdHxlbnwwfHx8fDE3MDkwODAzMzd8MA&ixlib=rb-4.0.3&q=85",
								"width": 3104,
								"height": 4656
							},
							"width": 3104,
							"height": 4656,
							"edit": {
								"background": "transparent",
								"zoom": 1,
								"crop": {
									"x": 0,
									"y": 0
								}
							}
						}
					}

3) Horizontal lines and HTML tags are ignored

Is there a way to workaround this?


These things can be only a problem with the simulator itself and maybe the format is fine, but since we can't use the API, I think it's the only wait for us to test it.

@jaen
Copy link
Author

jaen commented Feb 28, 2024

@afska as far as I understand the Migration API and the editor use a somewhat different format, so it might be the source of issues – that said, if you could give me the documents you use to test this (markdown, I assume) I can try testing this on my end and fix any problems that arise.

Horizontal lines and HTML tags are ignored

I have only maintained parity with what the tool does right now, and it ignores most of the tags.

This is probably related to the fact, that Prismic's rich text seems to have a limited selection of elements – as far as I can see there's only equivalents of <b>, <i>, <p>, <h1> to <h6>, <ul>, <ol>, <pre>, <img> and OpenGraph embeds available in the editor pallette. So there's no way to represent <hr> or other tags in Prismic rich text and they are either ignored or represented as text only (I don't remember off-hand if this is consistent, maybe we should just turn unknown elements into their text content consistently).

@afska
Copy link

afska commented Feb 29, 2024

@jaen ah, I see, thx. So, there's no way to display custom HTML in Prismic? I'm working on a migration that contains multiple markdown files, and we don't know whether they use inline HTML tags inside the markdown, so I was hoping there was a way to migrate these files as close to 1:1 as possible.

Here's the markdown file I used to test.

@jaen
Copy link
Author

jaen commented Feb 29, 2024

@afska it seems like it to me – but keep in mind as a fresh contractor I'm nowhere near being an authoritative source, customer support forum (https://community.prismic.io/) would be a better bet.

Thanks for the markdown file, I'll take a look at it when I have a moment.

@jaen
Copy link
Author

jaen commented Mar 1, 2024

@afska I was told you're in touch with someone from support, so if there's anything else that comes up that's not PR-related, I hope you won't mind if I'll respond via the support person as not to derail the PR.

But since I'm already posting a comment:

  • it seems there is a way to use raw HTML, see this post,
  • for the image, it seems that you can only use things that are already uploaded as an asset via the Media Library, not an arbitrary image link, and you have to specify their id as an attribute,
  • for the link, the JSON seems fine at a first glance, maybe there's something wrong with the preview – like I said I'll investigate it when I have time and either fix or forward it to the appropriate team, depending on what's the underlying issue.

@afska
Copy link

afska commented Mar 4, 2024

EDIT: Dismiss this message. I wasn't testing correctly.

Hey @jaen, thanks for your responses. I returned with more feedback! (this time more related to the PR)

So I tried the Migration API and yeah, it works pretty well. I didn't face any of the issues I commented before.

It seems though that your code outputs objects like { "type": "paragraph", "content": { "text": "...", "spans": [] } } and the API expects type, text, and span at the same level.

image

Another issue is that hyperlinks need to contain the key link_type: "Web", otherwise the API will throw an error.

We implemented this workaround to address the issues and wanted to ensure you were aware of these adjustments.

_adapt(richText) {
  return richText.map((part) => {
    if (part.content) part = { ...part, ...part.content, content: undefined };
    if (part.spans) {
      for (let span of part.spans) {
        if (span.type === "hyperlink" && span.data != null)
          span.data.link_type = "Web";
      }
    }

    return part;
  });
}

@jaen
Copy link
Author

jaen commented Mar 20, 2024

@afska hey, sorry for not getting back earlier, but I had some other tasks to focus on.

Anyway, I can't seem to reproduce what you described, when calling it like this:

markdown2prismic --format v2 -- "$(cat test/fixtures/markdown.md)"

Note the --format v2 (described in the ~third paragraph of PR description), without it the tool will fallback to the V1 format (which has the features you desrcribe). Are you sure you're passing this parameter?

@afska
Copy link

afska commented Mar 20, 2024

@jaen ah, that's probably it then, sorry. I was not sending the format flag.

The `:origin` field of the image was not being converted according to the API V2
representation. This is now corrected.
@jaen
Copy link
Author

jaen commented Apr 15, 2024

When writing example code for the Migration API I came across a slight inaccuracy in the V2 format with the origin field on image (it should've been flattened, but wasn't), it should now be fixed.

@francois2metz FYI I was given to understand that Prismic's import/export feature is planned to go away by the end of this month. As such it would probably be useful if there's an official version of this gem supporting API V2. So if you could spare some time for a review, it would be appreciated

It seems that there's a specific_install RubyGems plugin that allows easily installing a gem from a git branch and it's what we will probably recommend in the mean time.

@aadtyaraj01
Copy link

I'm getting an error when converting HTML to prismic JSON:
html2prismic --format v2 '

what up


H for Hamerica

'
< was unexpected at this time.
image

@aadtyaraj01
Copy link

I'm getting an error when converting HTML to prismic JSON: html2prismic --format v2 '

what up

H for Hamerica

'
< was unexpected at this time.
image

Works fine when HTML string is in double-quotes. I was using the example from the readme file which was in single quotes.

Anyway, awesome work.

@seppviljar
Copy link

Hey, @jaen - didn't want to add too much info here but could you check this issue #18 - problems with V2 prismic2markdown

@jaen
Copy link
Author

jaen commented Apr 25, 2024

@aadtyaraj01 yeah, it was an issue in how you called the script in your shell, not the script itself. But I agree that the current interface – passing the script as a parameter, not as a file path – is kind of awkward and I think it could be improved. But ultimately up to @francois2metz how he wants the CLI to look (and maybe would be a separate PR anyway)>

@seppviljar weird – I'll look at it as soon as I can and I'll respond when I understand the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants