-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integer unexpectedly large in GeoPackage #45178
Comments
Can you try running vacuum on that database? |
Thanks for pointing at that interesting tool coming along with the information that the DB can get fragmented. Didn't help much though as the vacuumed file with integer values (SMOD_POP_KG_DNI500_melilla-points_int_vacuumed.gpkg) is sill larger than original one with double values (SMOD_POP_KG_DNI500_melilla-points.gpkg) even when vacuumed (SMOD_POP_KG_DNI500_melilla-points_vacuumed.gpkg): |
Can you do a bit more analysis? Interesting would be
|
I cannot replicate the issue using QGIS 3.16.11 old (v1/EXE) OSGeo4W (SQLite 3.29.0) on Windows 7 64 bit.
|
I cannot replicate the issue also using QGIS 3.20.3 new (v2/MSI) OSGeo4W (SQLite 3.35.2) on Windows 10 64 bit. |
Referring to my previous statement: Thank you @agiudiceandrea for your replication efforts! What I did achieve on two independant machines (QGIS 3.20.1 on both) was to blow the file to 776KB (794.624 Bytes) when manually reversing DNI_2 to DNI_1. Meaning:
Though my intention was actually not to proof an error in reversability but this is probably something that should be looked at as well. I tried to do some investigation as proposed by @m-kuhn using https://www.sqlite.org/dbstat.html The following tables marked manually with (*) have had changes in data:
SMOD_POP_KG_DNI500_melilla-points_int.gpkg after reversing DNI_2 to DNI_1:
original SMOD_POP_KG_DNI500_melilla-points.gpkg:
Unfortunately I am not proficient enough with databases and even less with sqlite to dig down and compare the datasets that cause the trouble. Any hints how to dig deeper? Expanding and returning to my initial topic: Even if file sizes don't grow after converting |
Datatypes in SQLite are handled dynamically and they differ from common C++ static types: "In SQLite, the datatype of a value is associated with the value itself, not with its container." (See https://www.sqlite.org/datatype3.html) AFAIK, this means that the numeric integer value 6500 will be stored as a 2-bytes integer ("a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value") in both an INTEGER type and a REAL type column. But the numeric floating point value 6500.123 will be stored as a 8-byte IEEE floating point ("a floating point value, stored as an 8-byte IEEE floating point number") in both a REAL type and a INTEGER type column. |
Thank you so much @agiudiceandrea for pointing to https://www.sqlite.org/datatype3.html! That explains my main question. What a pity I couldn't replicate the original issue of a double->int converted and vacuumed DB file being larger than the original one so far. The file size increase after a conversion double->int and then reverting int->double is still unclear and could reveal some bad code. |
yes, I can also reproduce this. The compacted size goes from 778240 to 794624. Looking at the output of "echo .dump | sqlite3 the.db > the.db.txt" before original and the one after the 2 conversions, one can see that the order into which tables are stored in the file is not the same, and some differences in the spatial index. However at high level, the spatial index seems to be OK. I guess one must accept this as side effects of the compaction mechanism in SQLite3. Closing as not a bug |
What is the bug or the crash?
Common C++ types use 4 bytes for integer and 8 bytes for double.
Though expecting the file sizes to decrease after converting a double field to integer and removing the double field the converted file grows even larger.
The used file type is .gpkg. Haven't tried with any other file type.
File sizes in File manager:
Steps to reproduce the issue
When trying to reproduce this a second time I got a different but still larger file size of 860 kB. Yet the original resulting "corrupted" file SMOD_POP_KG_DNI500_melilla-points_int.gpkg (1.208kB) is contained in the archive. SMOD_POP_KG_DNI500_melilla_points.zip
Feature count and contents look exactly the same
Versions
QGIS version
3.20.2-Odense
QGIS code revision
9f59a15
Qt version
5.15.2
Python version
3.9.5
GDAL/OGR version
3.3.1
PROJ version
8.1.0
EPSG Registry database version
v10.027 (2021-06-17)
GEOS version
3.9.1-CAPI-1.14.2
SQLite version
3.35.2
PDAL version
2.3.0
PostgreSQL client version
13.0
SpatiaLite version
5.0.1
QWT version
6.1.3
QScintilla2 version
2.11.5
OS version
Windows 10 Version 2009
Active Python plugins
GroupStats
QuickOSM
db_manager
MetaSearch
processing
Supported QGIS version
New profile
Additional context
No response
The text was updated successfully, but these errors were encountered: