CSV datasource not converted from 3.16 to 3.22 correctly - projects not usable anymore #48587

Closed
2 tasks done
mlechner opened this issue May 17, 2022 · 31 comments · Fixed by #51881
Labels: Bug, Crash/Data Corruption, Data Source Manager, Feedback, Project, Windows

Comments

@mlechner

What is the bug or the crash?

In a project created with QGIS 3.16.x, the datasource of a CSV layer is given as:
<datasource>./aef/2und3k - Kopie.aef?type=csv&amp;delimiter=\t&amp;skipLines=115&amp;skipEmptyFields=Yes&amp;maxFields=10000&amp;detectTypes=yes&amp;decimalPoint=,&amp;xField=Center East&amp;yField=Center North&amp;crs=EPSG:32636&amp;spatialIndex=no&amp;subsetIndex=no&amp;watchFile=no</datasource>
When opening the project in QGIS 3.22.5, the datasource is not found, even though it exists at the correct place.

  • it is not possible to "Auto-Find" or "Browse" the source: nothing happens on click (on Auto-Find, a small window with no content appears and disappears)
  • when keeping all the unavailable layers on project opening, it is not possible to "repair the datasource", because the files (extension *.aef) do not show up, and the "Original source URI" in the Repair Data Source window is shown with lots of duplicates:
    Original source URI: t&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=nott&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=no&t&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=nost&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=nokt&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=noit&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=nopt&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=noLt&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=noit&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=nont&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG:32636&spatialIndex=no&subsetIndex=no&watchFile=noet&skipLines=115&skipEmptyFields=Yes&maxFields=10000&detectTypes=yes&decimalPoint=,&xField=Center East&yField=Center North&crs=EPSG ...

If I add the CSV file into a fresh project on 3.22.5 the datasource is saved as:
<datasource>file:./aef/2und3k%20-%20Kopie.aef?encoding&amp;type=csv&amp;delimiter=%5Ct&amp;skipLines=115&amp;skipEmptyFields=Yes&amp;maxFields=10000&amp;detectTypes=yes&amp;decimalPoint=,&amp;xField=Center%20East&amp;yField=Center%20North&amp;crs=EPSG:32636&amp;spatialIndex=no&amp;subsetIndex=no&amp;watchFile=no</datasource>

And works without problems.

It seems that datasource handling has changed and that 3.22.x does not convert the old datasource values correctly.

This results in QGIS projects that cannot be used anymore and have to be recreated from scratch when upgrading to 3.22.x. I guess this is a critical bug.
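
For illustration, the two `<datasource>` values above differ mainly in the `file:` prefix and in percent-encoding. The following is a minimal Python sketch, assuming (this is not confirmed by the QGIS source) that the newer form can be approximated by percent-encoding the path and each query value. The helper name `encode_legacy_uri` is hypothetical, and the sketch ignores the bare `encoding` key that 3.22.5 writes.

```python
from urllib.parse import quote


def encode_legacy_uri(uri: str) -> str:
    """Percent-encode a 3.16-style delimited text URI (hypothetical helper)."""
    path, _, query = uri.partition("?")
    encoded = "file:" + quote(path, safe="/")
    if not query:
        return encoded
    parts = []
    for item in query.split("&"):
        key, sep, value = item.partition("=")
        # Keep ',' and ':' literal, as in the URI written by 3.22.5.
        parts.append(key + sep + quote(value, safe=",:"))
    return encoded + "?" + "&".join(parts)


# Shortened version of the datasource from the report; "\\t" is backslash + t.
old = ("./aef/2und3k - Kopie.aef?type=csv&delimiter=\\t&skipLines=115"
       "&xField=Center East&yField=Center North&crs=EPSG:32636")
print(encode_legacy_uri(old))
```

Note that the sketch splits the query on `&`, so it would not handle a delimiter value that itself contains `&`.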

Steps to reproduce the issue

See the workflow described above.

Versions

3.22.5

Supported QGIS version

  • I'm running a supported QGIS version according to the roadmap.

New profile

  • I tried with a new QGIS profile

Additional context

No response

@mlechner mlechner added the Bug label May 17, 2022
@mlechner
Author

Maybe this is connected with #48422 (which is closed, but not solved AFAIK).

@mlechner
Author

mlechner commented May 17, 2022

@elpaso
Contributor

elpaso commented May 20, 2022

@mlechner can you please attach the failing project and data that I can use for debugging?

@mlechner
Author

@elpaso as it is not my project but my colleagues', and the data is sensitive, I cannot provide you with the failing project or data. I can try to create a simple project with example data that is suitable for reproducing the bug.

In the meantime I experimented in Python and found that
v.setDataSource(v.dataProvider().dataSourceUri().replace('/t', '%5Ct'), v.name(), v.providerType())
seems to fix the problem. In the QGIS project file (*.qgs) the tab delimiter is written as "\t", but QgsMapLayer.source() returns
?type=csv&delimiter=/t&

Maybe this hint helps until I can create a sample project.
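
The one-liner above can be expanded into a loop over all layers of the open project. This is a sketch under assumptions rather than a tested fix: the function names `fix_tab_delimiter` and `fix_project_layers` are hypothetical, and the PyQGIS part only runs inside QGIS (for example in the Python console).

```python
def fix_tab_delimiter(uri: str) -> str:
    """Replace the mangled '/t' delimiter with its percent-encoded form."""
    # Match the whole parameter so that paths containing '/t' stay untouched.
    return uri.replace("delimiter=/t", "delimiter=%5Ct")


def fix_project_layers() -> int:
    """Apply the fix to every delimited text layer; call this inside QGIS."""
    from qgis.core import QgsProject  # only importable inside QGIS

    fixed = 0
    for layer in QgsProject.instance().mapLayers().values():
        if layer.providerType() != "delimitedtext":
            continue
        uri = layer.source()
        new_uri = fix_tab_delimiter(uri)
        if new_uri != uri:
            layer.setDataSource(new_uri, layer.name(), layer.providerType())
            fixed += 1
    return fixed
```

Unlike the plain `replace('/t', '%5Ct')`, matching `delimiter=/t` avoids rewriting directory names that happen to start with `t`. It does not cover mixed delimiters such as `delimiter= /t;`.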

@elpaso
Contributor

elpaso commented May 20, 2022

> @elpaso as it is not my project but my colleagues', and the data is sensitive, I cannot provide you with the failing project or data. I can try to create a simple project with example data that is suitable for reproducing the bug.

Maybe you can send the project with the CSV redacted (just leave a couple of rows with fake data in it), data are probably not relevant to the bug, I think it's in the datasource URI encoding.

@mlechner
Author

Attached is a minimal example project (incl. *.qgs and data, as *.zip) for reproducing the bug. Just open it in QGIS >= 3.22.5 (maybe any 3.22.x).

delimitedtext_example.zip

@elpaso
Contributor

elpaso commented May 20, 2022

Thanks for sharing. I have no problems opening the sample file with master or 3.24.2.

Closing because I cannot reproduce the issue, it may be already fixed.

@elpaso elpaso closed this as not planned May 20, 2022
@mlechner
Author

I can NOT reproduce the bug on 3.22.7 LTR on Linux/Ubuntu anymore (I don't know about 3.22.6). I will check 3.22.7 LTR on Windows.

@ukulle

ukulle commented May 24, 2022

It seems to me that my similar problem opening delimited text files went away after moving the data files to a local drive like D:\data. The layer (i.e. the original data file) loaded from a network drive was invalid every time I loaded the project again. The problem appeared after I updated from LTR 3.16 to 3.22.6 (Windows 10).

@mlechner
Author

@ukulle: can you verify that the only change made was moving the project from a network drive to a local one? Our findings so far are as follows:

  • QGIS 3.22.5 / Win10 - not working
  • QGIS 3.22.6 / Win10 - working (but has to be checked again)
  • QGIS 3.22.7 / Win10 - not working
  • QGIS 3.22.7 / Ubuntu 22.04 - working

I will post more details tomorrow, as this is not fully clear to me yet.

Everything seems to break because the delimiter in the project is interpreted as "\t", "/t" or "%5Ct", while it definitely is "delimiter=\t&" in the original *.qgs file.

@ukulle

ukulle commented May 24, 2022

Sorry, it was worded a bit misleadingly (corrected above now). I left the project file on a network drive and moved the data files to a local drive. I have simple data (points) but they are constantly updated with a text editor. For the workaround ("local drive"), I had to import the data files that were previously moved to D:\data for the layer once and save the project file.

I have tested with different delimiters (see file names below).

`
using mixed delimiters

id latitude longitude
5746 49.906009;6.540942
5747 49.888455 6.544912
5748 49.870383;6.544483
`

QGIS 3.22.7 (mixed delimiters: blank, tab, semicolon) source network drive - NOT working
... source="./data/layer-by-delimited-text-delimiters-mixed.txt?type=csv&amp;delimiter= \t;&amp; ...
QGIS 3.22.7 (mixed delimiters: blank, tab, semicolon) source local drive - working
... source="file:///D:/data/layer-by-delimited-text-delimiters-mixed.txt?type=csv&amp;delimiter=%20%5Ct;&amp; ...
QGIS 3.22.7 (delimiter tab only) source network drive - NOT working
... source="./data/layer-by-delimited-text-delimiters-tab-only.txt?type=csv&amp;delimiter=\t&amp; ...
QGIS 3.22.7 (delimiter tab only) source local drive - working
... source="file:///D:/data/layer-by-delimited-text-delimiters-tab-only.txt?type=csv&amp;delimiter=%5Ct&amp; ...

No idea how this is caused, especially since it always worked in QGIS version 3.16
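
To check which layers of a project file are in the "NOT working" form listed above, the `<datasource>` elements of a *.qgs file can be scanned for a raw backslash in the delimiter value. This is only a sketch: the function name `find_unencoded_sources` is hypothetical, it looks at `<datasource>` elements only (not at `source="..."` attributes), and a *.qgz archive would have to be unzipped first.

```python
import re
import xml.etree.ElementTree as ET

# A literal backslash inside the delimiter value means it is not percent-encoded.
UNENCODED = re.compile(r"[?&]delimiter=[^&]*\\")


def find_unencoded_sources(qgs_xml: str) -> list:
    """Return <datasource> values whose delimiter contains a raw backslash."""
    root = ET.fromstring(qgs_xml)
    return [el.text for el in root.iter("datasource")
            if el.text and UNENCODED.search(el.text)]


# Tiny stand-in for a project file; real *.qgs files contain much more.
sample = """<qgis><maplayer>
<datasource>./data/a.txt?type=csv&amp;delimiter=\\t&amp;skipLines=8</datasource>
</maplayer><maplayer>
<datasource>file:./data/b.txt?type=csv&amp;delimiter=%5Ct&amp;skipLines=8</datasource>
</maplayer></qgis>"""
print(find_unencoded_sources(sample))
```

For a real project, the file content can be read with `open(path, encoding="utf-8").read()` and passed to the function.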

@mlechner
Author

mlechner commented May 25, 2022

@ukulle, @elpaso there definitely seems to be a bug in how QGIS handles tab delimiters (\t), but only on Windows!
I added more example projects. All of them work on Linux, while on Windows the projects only work when the \t delimiter has been manually edited to %5Ct. This affects 3.22.5 and still 3.22.7, but Windows only!

The only working example projects are

  • example_encodedDelimiter.qgs
  • example_withFailure-onlyEncDeli.qgs

All the others fail to open.

delimitedtext_example.zip

In addition, the hover text on the defective layer in the layer tree prints "/t" as the delimiter, even though "\t" is in the *.qgs file!

@elpaso - could you please reopen this issue, as it definitely is not fixed yet, but only seems to affect Windows-versions.

@elpaso elpaso reopened this May 25, 2022
@elpaso elpaso added the Windows label May 25, 2022
@elpaso elpaso removed their assignment May 25, 2022
@agiudiceandrea
Contributor

agiudiceandrea commented May 25, 2022

It seems to me the issue does occur on Windows with any QGIS version >= 3.22.0, while it doesn't occur with QGIS versions <= 3.20.3 (in this case, the only non-working project is example_changedslash2).

@mlechner
Author

mlechner commented Jun 2, 2022

Until this is fixed in QGIS core, I created a small experimental plugin to fix a project's layers.
See https://github.com/OpenBfS/bfs_delimitedtext_pathfixer
The plugin just replaces the delimiter "/t" in the datasource with "%5Ct" and marks the project as dirty.

@mlechner
Author

mlechner commented Jun 2, 2022

> It seems to me the issue does occur on Windows with any QGIS version >= 3.22.0, while it doesn't occur with QGIS versions <= 3.20.3 (in this case, the only non-working project is example_changedslash2).

That is correct. I just tried out several alternative slash/encoding configurations in the example file, without being sure that all of these can occur in practice. But it can definitely be said that the problem can be narrowed down to wrong parsing of "delimiter=\t" in the datasource definition in the project file on Windows. The source() string of the layer is "delimiter=/t" on Windows, even though a backslash is used in the project file (opened with an editor). So I guess fixing the parsing of this specific substring should fix QGIS. But that is beyond my knowledge of the QGIS code.

@mlechner
Author

The QGIS plugin repository now has an experimental plugin that tries to fix this problem, just as a temporary workaround until the problem itself is fixed. See

https://plugins.qgis.org/plugins/bfs_delimitedtext_pathfixer

@grandebooz

I found that changing the datasource section in the .qgs file
from
"./datafile.txt...."
to
"file:./CAVALLOTTI.txt...."
causes it to be loaded correctly!

Don't worry about the delimiter, both forms work; 'file:' is the key!
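
The manual edit described above could be applied to a whole project file with a small text substitution. This is a sketch of that single report, not a verified fix: the function name `add_file_scheme` is hypothetical, it only touches `<datasource>` values that start with `./` and contain `type=csv`, and it should be tried on a copy of the project file.

```python
import re

# A <datasource> value that starts with a relative path and is a CSV source.
RELATIVE_CSV = re.compile(r"(<datasource>)(\./[^<?]*\?[^<]*type=csv)")


def add_file_scheme(qgs_text: str) -> str:
    """Insert 'file:' before relative delimited text datasources."""
    return RELATIVE_CSV.sub(r"\1file:\2", qgs_text)


before = "<datasource>./datafile.txt?type=csv&amp;delimiter=\\t</datasource>"
print(add_file_scheme(before))
```

Datasources that already start with `file:` do not match the pattern and are left unchanged.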

@grandebooz

Or maybe there is another way to work around it...

@pathmapper
Contributor

Came across a similar issue which looks like it's related to an umlaut in a header field of the CSV.

Minimal test case in #49186.

@pathmapper
Contributor

> I think it's in the datasource URI encoding.

I think so, too.

@agiudiceandrea
Contributor

agiudiceandrea commented Jun 28, 2022

> a similar issue which looks like it's related to an umlaut in a header field of the CSV

It seems the data files provided by @mlechner don't have any umlauts in the header field names.

@pathmapper
Contributor

Right, I meant it is similar in the way that it seems to be related to source URL encoding/decoding and Windows specific.

@mlechner
Author

@elpaso in the QGIS changelog for 3.26 you wrote that it "Works for me on 3.24 and master". As this bug is OS-specific (Windows only), did it work for you on Win/3.24 and Win/master?
Because the affected 3.22.x is LTR, would this call for a backport/fix for 3.22.x LTR?

@ed76

ed76 commented Feb 14, 2023

It seems the problem has still not been solved. When will this error finally be fixed? At the moment it's not possible to use a QGIS version > 3.16.

@mlechner
Author

> It seems the problem has still not been solved. When will this error finally be fixed? At the moment it's not possible to use a QGIS version > 3.16.

Hi ed76, I cannot fix the problem in 3.16 core (as I am not deep in the core code), but did you try my experimental plugin #48587 (comment) as a dirty workaround?

Anyhow, I hope the bug will be fixed by one of the core developers who know how to fix it.

@elpaso
Contributor

elpaso commented Feb 14, 2023

@mlechner , @ed76 I am happy to have another look at the issue (on Windows using current QGIS master) but I am having trouble understanding how to reproduce it: I downloaded the last sample files from #48587 (comment), created a new project, and added the data from example.aef using the datasource manager and the delimited text provider, and I am not having any issue re-opening the project.

The delimiter is correctly encoded as %5Ct in the datasource in the QGS project file.

@elpaso elpaso added the Feedback label Feb 14, 2023
@mlechner
Author

@elpaso (@ed76) it does not happen when working only in a recent version. The problem occurs when you open a project that was created with QGIS 3.16.x on Windows in a current version on Windows. Then the CSV layers in the project cannot be found anymore if the delimiter is a tab. The problem was detected when 3.22.x became LTR and a bunch of existing 3.16.x (former LTR) projects did not, and still do not, open without problems in the current LTR version. I am not affected personally, but colleagues of mine are (and other users as well, I guess, e.g. @ed76), and I think this is a problematic bug, as the normal QGIS update procedure within the LTR slot results in broken projects!

@elpaso
Contributor

elpaso commented Feb 14, 2023

@mlechner thank you for clarifying. So it is not related to the data being on a network folder?

@elpaso elpaso self-assigned this Feb 14, 2023
@ukulle

ukulle commented Feb 14, 2023

@elpaso (@ed76 @mlechner) - reproducing the behavior seems to be complex. Maybe I have a special case here.

  • I created a new project (QGIS 3.22.14) and added the data from example.aef using the datasource manager and the delimited text provider.
  • I saved the project (project and datasource on a network drive!).
  • I had no issue re-opening the project (NOTE: always opening from the main menu with Ctrl+O).
  • I opened the project.qgz file by double-click: no problem so far.
  • This is important: I saved the project file.
  • I opened the project.qgz file again, and the issue occurred!

The difference found in the project files:
newly created:
<datasource>file:./data/example.aef?type=csv&amp;delimiter=%5Ct&amp;skipLines=8&amp; ...
after opening by double-click:
<datasource>./data/example.aef?type=csv&amp;delimiter=\t&amp;skipLines=8&amp; ...

If the datasource is located on a local drive, this part of the problem does not occur.
The datasource differs from the ones above, but stays the same throughout the described test case:
<datasource>file:///D:/_temp/example.aef?type=csv&amp;delimiter=%5Ct&amp;skipLines=8&amp; ...

I hope this can be reproduced and am looking forward to the feedback.
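
To make the comparison above concrete: after percent-decoding and dropping the `file:` prefix, the "newly created" and "after double-click" datasource strings describe the same source. A minimal sketch (the function name `same_source` is hypothetical, and the URIs are shortened from the ones quoted above):

```python
from urllib.parse import unquote


def same_source(a: str, b: str) -> bool:
    """Compare two datasource strings after decoding and dropping 'file:'."""
    def normalise(uri: str) -> str:
        uri = unquote(uri)
        return uri[len("file:"):] if uri.startswith("file:") else uri
    return normalise(a) == normalise(b)


new = "file:./data/example.aef?type=csv&delimiter=%5Ct&skipLines=8"
old = "./data/example.aef?type=csv&delimiter=\\t&skipLines=8"
print(same_source(new, old))
```

So the two forms carry the same information; only the encoding and the scheme prefix differ.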

@elpaso
Contributor

elpaso commented Feb 14, 2023

I think I have found the problem.

elpaso added a commit to elpaso/QGIS that referenced this issue Feb 15, 2023
when importing old projects, delimiter might be unencoded.

Fix qgis#48587
@elpaso
Contributor

elpaso commented Feb 15, 2023

@ukulle I have fixed the encoding problem for local projects with #51881 , the network problem might be a different one.

If my PR does not fix the network problem please file a new issue (search if a similar issue does not yet exist first).

nyalldawson pushed a commit that referenced this issue Feb 21, 2023
when importing old projects, delimiter might be unencoded.

Fix #48587
qgis-bot pushed a commit that referenced this issue Feb 21, 2023
when importing old projects, delimiter might be unencoded.

Fix #48587
nyalldawson pushed a commit that referenced this issue Feb 21, 2023
when importing old projects, delimiter might be unencoded.

Fix #48587

8 participants