Version 3.0 #27

nilshoerrmann · 2013-09-20T09:40:08Z

This feature branch should prepare version 3.0 of XML Importer. It should fix some of the extension's major issues – I just need to convince @brendo to work on this with me :)

Features

API – @brendo

change XML import from direct sources to Symphony Data Sources
make sure all validation errors a stored and prompted to the user
if the import failed, only show entries with errors and not all entries (also reflect this in the error message which currently states 1234 entries of 1234 entries did not validate although it might just have been 1 or 2)
if the import failed, allow the user to ignore errors and only import the valid entries
old importer files should not be rendered in the interface and display a message that they have been created with an older version
data source selection: provide xpath to /data/name/entry as data attributes in select box options
if an entry is invalid, provide the line number of the failing node in the error meta data

UI – @nilshoerrmann

change import UI from direct links to Data Sources
make the error page easier to read, giving better feedback
~~fix duplicator integration with XPath and focus being stolen~~ (when does this happen actually?)

nilshoerrmann · 2013-09-29T16:47:53Z

@brendo: Two questions:

Is the remote data source interface so flexible that we could include it inside the importer's interface? I'm thinking of an on demand inclusion where the user can choose between an existing data source or a new one. This would only be interesting, if it can be done with a few lines of code.
What about pre-populating the xPath expression depending on the data source? If the user selects a section data source of the name Blog: All Entries we know that the path to the entries – which will most likely be used – will be /data/blog-all-entries/entry. If we include the path as a data attribute of the select box with all data sources, we could change the path dynamically when the user switches sources.

nilshoerrmann · 2013-09-29T17:07:21Z

@brendo, I just updated the to do list in the initial post.

nilshoerrmann · 2013-09-29T17:14:10Z

Regarding the success and error page after running an importer: My idea is to display the entries in an index table, filtered by status (created, updated, skipped, failed). This should make it easier to scan the results – especially when looking for errors. There is only one problem that needs to be solved: when debugging error, were could the input array and the resulting XML be presented?

Should there be some kind of debug view, like the frontend debugger that the user can switch to?

From my experience, common errors are missing required fields or invalid XML. The latter will now result in a general data source error, so nothing we'll have to take care of in the importer interface itself.

Do we actually need to display the raw data?
(It currently does not display multidimensional arrays, so it's broken anyway.)

brendo · 2013-09-29T22:04:33Z

Is the remote data source interface so flexible that we could include it inside the importer's interface? I'm thinking of an on demand inclusion where the user can choose between an existing data source or a new one. This would only be interesting, if it can be done with a few lines of code.

Not sure, it relies on the markup and assets that live on the DSE page, probably one we'd just have to try to be honest. I'm not overly keen on this, at the moment we're supporting all DataSources :)

What about pre-populating the xPath expression depending on the data source? If the user selects a section data source of the name Blog: All Entries we know that the path to the entries – which will most likely be used – will be /data/blog-all-entries/entry. If we include the path as a data attribute of the select box with all data sources, we could change the path dynamically when the user switches sources.

This is definitely doable, but it'd lose usefulness for Dynamic/Remote DS's (as the XML may not have the /entry structure.

Regarding the success and error page after running an importer: My idea is to display the entries in an index table, filtered by status (created, updated, skipped, failed). This should make it easier to scan the results – especially when looking for errors. There is only one problem that needs to be solved: when debugging error, were could the input array and the resulting XML be presented?

Not sure what you're after here? All the entries are available in $this->_entries. Each entry has a status which is created, failed, skipped, updated. The result can be displayed on the 'success' page?

Should there be some kind of debug view, like the frontend debugger that the user can switch to?

I don't think we'll get there for V3. AFAIK we can't seed the Debug Devkit with XML, and attempting to embed/replicate Debug Devkit will be a difficult exercise.

Do we actually need to display the raw data?
(It currently does not display multidimensional arrays, so it's broken anyway.)

Yes, because there's no way to debug exactly what your 'Remote' is returning. It should display multidimensional, that's just a bug :)

nilshoerrmann · 2013-09-30T04:24:48Z

Not sure, it relies on the markup and assets that live on the DSE page, probably one we'd just have to try to be honest. I'm not overly keen on this, at the moment we're supporting all DataSources :)

Right, scratch that.

This is definitely doable, but it'd lose usefulness for Dynamic/Remote DS's (as the XML may not have the /entry structure.

But even for remote data sources we know the first two levels: /data/name-of-ds.

Not sure what you're after here? … AFAIK we can't seed the Debug Devkit with XML …

Oh, I don't want to make use of the devkits. This was more a rambling about the visual and structural representation of succeeded and failed entries.

brendo · 2013-09-30T13:01:52Z

The XPath hint can already be derived from the value of the option, so an extra data attribute is superfluous :)

brendo · 2013-09-30T13:04:55Z

With regards to the line number, what is the line number relative to? In the debug devkit we have the params output at the top of the file, so that will skew the line numbers reported in XML Importer compared to the Devkit.

It sounds like you want to embed the entire XML somewhere so that you can link directly through to the problematic XML whilst in the backend.

brendo · 2013-09-30T13:12:17Z

And finally, we don't have any mechanisms to determine what version of the XML Importer created the file, so blocking the interface will need a little thought too :) The source is still going to be a string, the only thing I can think of is see if the string matches the datasource handles available. If it does, it's a new datasource, if it's not, then it's an old one.

I've left enough legacy code in the XML Importer so old importers should still work, but as you say, the interface won't be able to handle them.

Blocking will need some thought:

Prevent the user from clicking through to the editor from the index
Remove the save/delete buttons from the interface
Remove the Edit XMLImporter link from the Run profile
Just hide the Source area (make it readonly) and allow users to update XPath of older files?

brendo · 2013-09-30T13:16:08Z

The duplicator oddity:

Create a Section with a single field (I used Text Input)
Create a Datasource that points to that new section
Create an XMLImporter, link it to that new Datasource
Use the duplicator to add XPath for that single field
Press save, and then scroll down the page
The instance and template for the single field both appear. I am now unable to save the Datasource, or remove the XPath for the field

nilshoerrmann · 2013-09-30T14:31:48Z

The XPath hint can already be derived from the value of the option, so an extra data attribute is superfluous :)

Ähm, how that?

nilshoerrmann · 2013-09-30T14:34:00Z

It sounds like you want to embed the entire XML somewhere so that you can link directly through to the problematic XML whilst in the backend.

Well, if an error occurs, XML Importer currently displays the related XML node and the resulting PHP Array. My idea was to give the user some context where this node can be found in the original source.

nilshoerrmann · 2013-09-30T14:41:30Z

And finally, we don't have any mechanisms to determine what version of the XML Importer created the file, so blocking the interface will need a little thought too :) The source is still going to be a string, the only thing I can think of is see if the string matches the datasource handles available. If it does, it's a new datasource, if it's not, then it's an old one.

Shouldn't all old resources either start with http or {$root}?

Blocking will need some thought:

Prevent the user from clicking through to the editor from the index

Remove the save/delete buttons from the interface

Remove the Edit XMLImporter link from the Run profile

Just hide the Source area (make it readonly) and allow users to update XPath of older files?

I wouldn't do anything fancy and just display a message in the edit view – similar to how the DS editor works when allowEditorToParse() is set to false. And I wouldn't adjust the rest of the interface (buttons et al.).

nilshoerrmann · 2013-09-30T14:48:17Z

Custom Helpers

There is one idea I forgot to mention in our chat yesterday: Now that we switched to a more standard approach to fetch data using Symphony data sources, why don't we do the same to format data? We have those custom, static PHP helpers that cannot be shared easily but we have text formatter that do a similar task. Why don't we replace XMLImporterHelpers with text formatters?

Something like Markdownify can be helpful in other contexts and text formatters are a core concept already.

brendo · 2013-09-30T21:27:38Z

The XPath hint can already be derived from the value of the option, so an extra data attribute is superfluous :)

Ähm, how that?

The option is currently <option value='ds-handle'>Datasource Name</option>, so adding a data attribute that is xpath-hint='/data/ds-handle/entry' is redundant since we already know about /data (or can just use //) and /entry is also known (but risky considering Dynamic/Remote may not have entry nodes).

It sounds like you want to embed the entire XML somewhere so that you can link directly through to the problematic XML whilst in the backend.

Well, if an error occurs, XML Importer currently displays the related XML node and the resulting PHP Array. My idea was to give the user some context where this node can be found in the original source.

I'll have a look at what we can do :)

And finally, we don't have any mechanisms to determine what version of the XML Importer created the file, so blocking the interface will need a little thought too :) The source is still going to be a string, the only thing I can think of is see if the string matches the datasource handles available. If it does, it's a new datasource, if it's not, then it's an old one.

Shouldn't all old resources either start with http or {$root}?

Come to think of it, yes :) If not all, I'm sure it's a high majority.

I wouldn't do anything fancy and just display a message in the edit view – similar to how the DS editor works when allowEditorToParse() is set to false. And I wouldn't adjust the rest of the interface (buttons et al.).

At the moment there's a completely different state for that scenario though, we don't display any of the editor it's just a read only page with information from the about() array. Allowing a user to save this XMLImporter can ruin it (as the source would be lost).

There is one idea I forgot to mention in our chat yesterday: Now that we switched to a more standard approach to fetch data using Symphony data sources, why don't we do the same to format data? We have those custom, static PHP helpers that cannot be shared easily but we have text formatter that do a similar task. Why don't we replace XMLImporterHelpers with text formatters?

This is assuming that helpers always do something like 'format' though, which isn't always the case. I agree though, at the moment they are inflexible. What if we just made them live in /workspace/ so they can be shared through projects (I thought I'd done this already, but must of been a dream!)

nilshoerrmann · 2014-06-23T12:35:48Z

Brendan, I've made a few update to the datasource branch but the branch is out of sync with integration and master (all three have different commit histories). Could you have a look what needs to be merged from master and integration?

I've tested this branch over the last days and I have to say that importing has never been easier and smoother. It's a big improvement to use Data Sources here as it makes debugging invalid sources a lot easier.

…mport the valid ones

…h obtaining the source from DataSources

…unt for the imported data as well, not just the existing data

brendo · 2014-09-28T07:07:48Z

This is totally rebased with integration's latest.

brendo · 2014-09-28T07:18:35Z

@nilshoerrmann You are free to carry on with testing :)

nilshoerrmann · 2014-09-29T08:01:57Z

Very cool, thanks Brendan!
I'll try to integrate the changes in a project next week and will report back.

jonmifsud · 2014-12-06T17:23:31Z

Hi Guys what's the status of this at the moment? Also a quick question - but I guess the XML importer latest version it wouldn't be possible to import entries with pre-set Entry ID's right?

brendo · 2014-12-17T11:44:19Z

@jonmifsud Hey Jon! I'll email you as well, but I've just merged the datasources work into the integration branch and removed the datasources branch. The next version of XML Importer will be this integration branch which has a new approach by abstracting the retrieval of data to Data Sources.

Also a quick question - but I guess the XML importer latest version it wouldn't be possible to import entries with pre-set Entry ID's right?

No, it doesn't, but you're welcome to work on this feature and submit against the integration branch. I imagine it would be very helpful for managing complex relationships through import/export. Let me know how you go though, I have a feeling there is some checks in EntryManager::add that is not going to make this easy.

Oh, also, maybe this discussion should continue in #19

jonmifsud · 2014-12-17T12:12:51Z

@brendo thanks for the information, If thins go to plan I should be trying this out next week.

nilshoerrmann mentioned this pull request Sep 29, 2013

CSV Sources symphonycms/remote_datasource#11

Closed

michael-e mentioned this pull request Oct 4, 2013

Number of entries not displayed symphonists/publishfiltering#39

Closed

nilshoerrmann mentioned this pull request Apr 26, 2014

Duplicator issue: XML Importer symphonycms/symphonycms#2051

Closed

This was referenced Jul 4, 2014

Content Types #24

Closed

Add the ability to parse URL parameters #4

Closed

Nils Hörrmann and others added 10 commits September 28, 2014 16:44

Prepare version 3.0

d118b20

Add partially import support, which will ignore invalid entries and i…

d2c04b3

…mport the valid ones

Remove the ability to load source from a URL (in the UI). Replace wit…

9223228

…h obtaining the source from DataSources

Handle {} as a URL. Add XPath errors inline

33ff10a

Handle when the linked DS has namespaces

9f64088

Fix error when using the unique field & updating existing entries

8f84114

Modify the edit/skip logic so that duplicates will be taken into acco…

9f1d8ba

…unt for the imported data as well, not just the existing data

Add debugging for when errors occur when setting values

926ceac

Small fixes

6acc035

Format template

b7e7e47

nilshoerrmann and others added 7 commits September 28, 2014 16:53

Make sure XML structure matches the one from the frontend

cb4574e

Make importer sources more readable

c3fe499

Suggest xPath

51e8615

Display Data Source name

cf32625

Update readme

6207c85

Updated meta xml

0df578f

Fix some rebasing issues!

39357cc

brendo force-pushed the datasources branch from dfa1e0b to 39357cc Compare September 28, 2014 07:09

Add a version to the template so the new GUI does not load old Importers

3dd5a13

Strict standards fixes

dcfb14d

brendo closed this Dec 17, 2014

brendo deleted the datasources branch December 17, 2014 11:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Version 3.0 #27

Version 3.0 #27

nilshoerrmann commented Sep 20, 2013

nilshoerrmann commented Sep 29, 2013

nilshoerrmann commented Sep 29, 2013

nilshoerrmann commented Sep 29, 2013

brendo commented Sep 29, 2013

nilshoerrmann commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

brendo commented Sep 30, 2013

nilshoerrmann commented Jun 23, 2014

brendo commented Sep 28, 2014

brendo commented Sep 28, 2014

nilshoerrmann commented Sep 29, 2014

jonmifsud commented Dec 6, 2014

brendo commented Dec 17, 2014

jonmifsud commented Dec 17, 2014

Version 3.0 #27

Version 3.0 #27

Conversation

nilshoerrmann commented Sep 20, 2013

Features

API – @brendo

UI – @nilshoerrmann

nilshoerrmann commented Sep 29, 2013

nilshoerrmann commented Sep 29, 2013

nilshoerrmann commented Sep 29, 2013

brendo commented Sep 29, 2013

nilshoerrmann commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

brendo commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

nilshoerrmann commented Sep 30, 2013

Custom Helpers

brendo commented Sep 30, 2013

nilshoerrmann commented Jun 23, 2014

brendo commented Sep 28, 2014

brendo commented Sep 28, 2014

nilshoerrmann commented Sep 29, 2014

jonmifsud commented Dec 6, 2014

brendo commented Dec 17, 2014

jonmifsud commented Dec 17, 2014