Skip to content

Commit

Permalink
Merge pull request #532 from ccrook/delimited_text_bug_fixes
Browse files Browse the repository at this point in the history
Three delimited text bug fixes, GUI tidyup, context help fix
  • Loading branch information
timlinux committed Apr 17, 2013
2 parents 22e54b9 + e1421f5 commit 4fa44cf
Show file tree
Hide file tree
Showing 20 changed files with 1,070 additions and 790 deletions.
59 changes: 0 additions & 59 deletions resources/context_help/QgsDelimitedTextPluginGui-en_US

This file was deleted.

172 changes: 172 additions & 0 deletions resources/context_help/QgsDelimitedTextSourceSelect-en_US
Original file line number Diff line number Diff line change
@@ -0,0 +1,172 @@
<h3>Delimited Text File Layer</h3>
Loads and displays delimited text files
<p>
<a href="#re">Overview</a><br/>
<a href="#creating">Creating a delimited text layer</a><br/>
<a href="#csv">How the delimiter, quote, and escape characters work</a><br />
<a href="#regexp">How regular expression delimiters work</a><br />
<a href="#wkt">How WKT text is interpreted</a><br />
<a href="#example">Example of a text file with X,Y point coordinates</a><br/>
<a href="#wkt_example">Example of a text file with WKT geometries</a><br/>
<a href="#notes">Notes</a><br/>
</p>

<h4><a name="re">Overview</a></h4>
<p>A &quot;delimited text file&quot; contains data in which each record starts on a new line, and
is split into fields by a delimiter such as a comma.
This type of file is commonly exported from spreadsheets (for example CSV files) or databases.
Typically the first line of a delimited text file contains the names of the fields.
</p>
<p>
Delimited text files can be loaded into QGIS as a layer.
The records can be displayed spatially either as a point
defined by X and Y coordinates, or using a Well Known Text (WKT) definition of a geometry which may
describe points, lines, and polygons of arbitrary complexity. The file can also be loaded as an attribute
only table, which can then be joined to other tables in QGis.
</p>
<p>
In addition to the geometry definition the file can contain text, integer, and real number fields. QGis
will choose the type of field based on its contents.
</p>
<h4><a name="creating">Creating a delimited text layer</a></h4>
<p>Creating a delimited text layer involves choosing the data file, defining the format (how each record is to
be split into fields), and defining the geometry is represented.
This is managed with the delimited text dialog as detailed below.
The dialog box displays a sample from the beginning of the file which shows how the format
options have been applied.
</p>
<h5>Choosing the data file</h5>
<p>Use the &quot;Browse...&quot; button to select the data file. Once the file is selected the
layer name will automatically be populated based on the file name. The layer name is used to represent
the data in the QGis legend.
</p>
<p>
By default files are assumed to be encoded as UTF-8. However other file
encodings can be selected. For example &quot;System&quot; uses the default encoding for the operating system.
If you are expecting to move the QGis project then it is safer to use a specific encoding.
</p>
<h5>Specifying the file format</h5>
<p>The file format can be one of
<ul>
<li>CSV file format. This is a format commonly used by spreadsheets, in which fields are delimited
by a comma character, and quoted using a &quot;(quote) character. Within quoted fields, a quote
mark is entered as &quot;&quot;.</li>
<li>Selected delimiters. Each record is split into fields using one or more delimiter character.
Quote characters are used for fields which may contain delimiters. Escape characters may be used
to treat the following character as a normal character (ie to include delimiter, quote, and
new line characters in text fields). The use of delimiter, quote, and escape characters is detailed <a href="#csv">below</a>.
<li>Regular expression. Each line is split into fields using a &quot;regular expression&quot; delimiter.
The use of regular expressions is details <a href="#regexp">below</a>.
</ul>
<h5>Record and field options</h5>
<p>The following options affect the selection of records and fields from the data file</p>
<ul>
<li>Number of header lines to discard: used to skip over header lines at the beginning of the text file</li>
<li>First record has fields names: if selected then the first record in the file (after the skipped lines) is interpreted as names of fields, rather than as a data record.</li>
<li>Trim fields: if selected then leading and trailing whitespace characters will be removed from each field (except quoted fields). </li>
<li>Discard empty fields: if selected then empty fields (after trimming) will be discard. This
affects the alignment of data into fields and is equivalent to treating consecutive delimiters as a
single delimiter. Quoted fields are never discarded.</li>
<li>Decimal point is comma: if selected then commas in real numbers represent the decimal point. For
example &quot;-51,354&quot; is equivalent to -51.354.
</li>
</ul>
<h5>Geometry definition</h5>
<p>The geometry is can be define as one of</p>
<ul>
<li>Point coordinates: each feature is represented as a point defined by X and Y coordinates.</li>
<li>Well known text (WKT) geometry: each feature is represented as a well known text string, for example
&quot;POINT(1.525622 51.20836)&quot;. See details of the <a href="#wkt">well known text</a> format.
<li>No geometry (attribute only table): records will not be displayed on the map, but can be viewed
in the attribute table and joined to other layers in QGis</li>
</ul>
<p>For point coordinates the following options apply:</p>
<ul>
<li>X field: specifies the field containing the X coordinate</li>
<li>Y field: specifies the field containing the Y coordinate</li>
<li>DMS angles: if selected coordinates are represented as degrees/minutes/seconds
or degrees/minutes. QGis is quite permissive in its interpretation of degrees/minutes/seconds.
A valid DMS coordinate will contain three numeric fields with an optional hemisphere prefix or suffix
(N, E, or + are positive, S, W, or - are negative). Additional non numeric characters are
generally discarded. For example &quot;N41d54'01.54&quot;&quot; is a valid coordinate.
</li>
</ul>
<p>For well known text geometry the following options apply:</p>
<ul>
<li>Geometry field: the field containing the well known text definition.</li>
<li>Geometry type: one of &quot;Detect&quot; (detect), &quot;Point&quot;, &quot;Line&quot;, or &quot;Polygon&quot;.
QGis layers can only display one type of geometry feature (point, line, or polygon). This option selects
which geometry type is displayed in text files containing multiple geometry types. Records containing
other geometry types are discarded.
If &quot;Detect&quot; is selected then the type of the first geometry in the file will be used.
&quot;Point&quot; includes POINT and MULTIPOINT WKT types, &quot;Line&quot; includes LINESTRING and
MULTLINESTRING WKT types, and &quot;Polygon&quot; includes POLYGON and MULTIPOLYGON WKT types.
</ul>

<h4><a name="csv">How the delimiter, quote, and escape characters work</a></h4>
<p>Records are split into fields using three character sets: delimiter characters, quote characters,
and escape characters. Quote and escape characters cannot be the same as delimiter characters - they
will be ignored if they are. Escape characters can be the same as quote characters, but behave differently
if they are.</p>
<p>The delimiter characters are used to mark the end of each field. If more than one delimiter character
is defined then any one of the characters can mark the end of a field. The quote and escape characters
can override the delimiter character, so that it is treated as a normal character.</p>
<p>Quote characters may be used to mark the beginning and end of quoted fields. Quoted fields can
contain delimiters and may span multiple lines in the text file. If a field is quoted then it must
start and end with the same quote character. Quote characters cannot occur within a field unless they
are escaped.</p>
<p>Escape characters which are not quote characters force the following character to be treated normally
(that is, to stop it being treated as a new line, delimiter, or quote character).
</p>
<p>If a quote character is also an escape character, then it can be represented in a quoted field by
entering it twice. For example if ' is a quote character and an escape character, then the string
'Smith''s&nbsp;Creek' will represent the value Smith's&nbsp;Creek.
</p>
<h4><a name="regexp">How regular expression delimiters work</a></h4>
<p>Regular expressions are mini-language used to represent character patterns. There are many variations
of regular expression syntax - QGis uses the syntax provided by the <a href="http://qt-project.org/doc/qt-4.8/qregexp.html">QRegExp</a> class of the <a href="http://qt.digia.com">Qt</a> framework.</p>
<p>In a regular expression delimited file each line is treated as a record. Each match of the regular expression in the line is treated as the end of a field.</p>

<h4><a name="wkt">How WKT text is interpreted</a></h4>
<p>
The delimited text layer recognizes the following
<a href="http://en.wikipedia.org/wiki/Well-known_text">well known text</a> types -
POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, and MULTIPOLYGON. It will accept geometries with
a Z coordinate (eg &quot;POINT&nbsp;Z&quot;), a measure (&quot;POINT&nbsp;M&quot;), or both (&quot;POINT&nbsp;ZM&quot;).
</p>
<p>
It can also handle the PostGIS EWKT variation, in which the geomtry is preceded by an spatial reference
system id (eg &quot;SRID=4326;POINT(175.3&nbsp;41.2)&quot;), and a variant used by Informix in which the WKT is
preceded by an integer spatial reference id (eg &quot;1 POINT(175.3&nbsp;41.2)&quot;).
In both cases the SRID is ignored.
</p>
<h4><a name="example">Example of a text file with X,Y point coordinates</a></h4>
<pre>
X;Y;ELEV<br />
-300120;7689960;13<br />
-654360;7562040;52<br />
1640;7512840;3<br />
</pre>
<p>This file:</p>
<ul>
<li> Uses <b>;</b> as delimiter. Any character can be used to delimit the fields.</li>
<li>The first row is the header row. It contains the field names X, Y and ELEV.</li>
<li>No quotes (") are used to delimit text fields.</li>
<li>The x coordinates are contained in the X field.</li>
<li>The y coordinates are contained in the Y field.</li>
</ul>
<h4><a name="wkt_example">Example of a text file with WKT geometries</a></h4>
<pre>
id|wkt<br />
1|POINT(172.0702250 -43.6031036)<br />
2|POINT(172.0702250 -43.6031036)<br />
3|POINT(172.1543206 -43.5731302)<br />
4|POINT(171.9282585 -43.5493308)<br />
5|POINT(171.8827359 -43.5875983)<br />
</pre>
<p>This file:</p>
<ul>
<li>Has two fields defined in the header row: id and wkt.
<li>Uses <b>|</b> as a delimiter.</li>
<li>Specifies each point using the WKT notation
</ul>
2 changes: 1 addition & 1 deletion src/core/qgsvectorlayer.h
Original file line number Diff line number Diff line change
Expand Up @@ -336,7 +336,7 @@ struct CORE_EXPORT QgsVectorJoinInfo
*
* - decimalPoint=c
*
* Defines a character that is used as a decimal point in the X and Y columns.
* Defines a character that is used as a decimal point in the numeric columns
* The default is '.'.
*
* - xyDms=(yes|no)
Expand Down
18 changes: 13 additions & 5 deletions src/providers/delimitedtext/qgsdelimitedtextfeatureiterator.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -216,16 +216,24 @@ void QgsDelimitedTextFeatureIterator::fetchAttribute( QgsFeature& feature, int f
switch ( P->attributeFields[fieldIdx].type() )
{
case QVariant::Int:
if ( !value.isEmpty() )
val = QVariant( value );
else
if ( value.isEmpty() )
val = QVariant( P->attributeFields[fieldIdx].type() );
else
val = QVariant( value );
break;
case QVariant::Double:
if ( !value.isEmpty() )
if ( value.isEmpty() )
{
val = QVariant( P->attributeFields[fieldIdx].type() );
}
else if ( P->mDecimalPoint.isEmpty() )
{
val = QVariant( value.toDouble() );
}
else
val = QVariant( P->attributeFields[fieldIdx].type() );
{
val = QVariant( QString( value ).replace( P->mDecimalPoint, "." ).toDouble() );
}
break;
default:
val = QVariant( value );
Expand Down
4 changes: 4 additions & 0 deletions src/providers/delimitedtext/qgsdelimitedtextprovider.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -372,6 +372,10 @@ QgsDelimitedTextProvider::QgsDelimitedTextProvider( QString uri )
}
if ( couldBeDouble[fieldPos] )
{
if ( ! mDecimalPoint.isEmpty() )
{
value.replace( mDecimalPoint, "." );
}
value.toDouble( &couldBeDouble[fieldPos] );
}
fieldPos++;
Expand Down
15 changes: 10 additions & 5 deletions src/providers/delimitedtext/qgsdelimitedtextsourceselect.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,8 @@ QgsDelimitedTextSourceSelect::QgsDelimitedTextSourceSelect( QWidget * parent, Qt
mFile( new QgsDelimitedTextFile() ),
mExampleRowCount( 20 ),
mColumnNamePrefix( "Column_" ),
mPluginKey( "/Plugin-DelimitedText" )
mPluginKey( "/Plugin-DelimitedText" ),
mLastFileType("")
{

setupUi( this );
Expand Down Expand Up @@ -250,7 +251,7 @@ void QgsDelimitedTextSourceSelect::loadSettings( QString subkey, bool loadGeomSe
QString encoding = settings.value( key + "/encoding", "" ).toString();
if ( ! encoding.isEmpty() ) cbxEncoding->setCurrentIndex( cbxEncoding->findText( encoding ) );
QString delimiters = settings.value( key + "/delimiters", "" ).toString();
if ( delimiters.isEmpty() ) setSelectedChars( delimiters );
if ( ! delimiters.isEmpty() ) setSelectedChars( delimiters );

txtQuoteChars->setText( settings.value( key + "/quoteChars", "\"" ).toString() );
txtEscapeChars->setText( settings.value( key + "/escapeChars", "\"" ).toString() );
Expand All @@ -262,14 +263,14 @@ void QgsDelimitedTextSourceSelect::loadSettings( QString subkey, bool loadGeomSe
cbxUseHeader->setChecked( settings.value( key + "/useHeader", "true" ) != "false" );
cbxTrimFields->setChecked( settings.value( key + "/trimFields", "false" ) == "true" );
cbxSkipEmptyFields->setChecked( settings.value( key + "/skipEmptyFields", "false" ) == "true" );
cbxPointIsComma->setChecked( settings.value( key + "/decimalPoint", "." ).toString().contains( "," ) );

if ( loadGeomSettings )
{
QString geomColumnType = settings.value( key + "/geomColumnType", "xy" ).toString();
if ( geomColumnType == "xy" ) geomTypeXY->setChecked( true );
else if ( geomColumnType == "wkt" ) geomTypeWKT->setChecked( true );
else geomTypeNone->setChecked( true );
cbxPointIsComma->setChecked( settings.value( key + "/decimalPoint", "." ).toString().contains( "," ) );
cbxXyDms->setChecked( settings.value( key + "/xyDms", "false" ) == "true" );
}

Expand Down Expand Up @@ -297,13 +298,13 @@ void QgsDelimitedTextSourceSelect::saveSettings( QString subkey, bool saveGeomSe
settings.setValue( key + "/useHeader", cbxUseHeader->isChecked() ? "true" : "false" );
settings.setValue( key + "/trimFields", cbxTrimFields->isChecked() ? "true" : "false" );
settings.setValue( key + "/skipEmptyFields", cbxSkipEmptyFields->isChecked() ? "true" : "false" );
settings.setValue( key + "/decimalPoint", cbxPointIsComma->isChecked() ? "," : "." );
if ( saveGeomSettings )
{
QString geomColumnType = "none";
if ( geomTypeXY->isChecked() ) geomColumnType = "xy";
if ( geomTypeWKT->isChecked() ) geomColumnType = "wkt";
settings.setValue( key + "/geomColumnType", geomColumnType );
settings.setValue( key + "/decimalPoint", cbxPointIsComma->isChecked() ? "," : "." );
settings.setValue( key + "/xyDms", cbxXyDms->isChecked() ? "true" : "false" );
}

Expand All @@ -313,7 +314,10 @@ void QgsDelimitedTextSourceSelect::loadSettingsForFile( QString filename )
{
if ( filename.isEmpty() ) return;
QFileInfo fi( filename );
loadSettings( fi.suffix(), true );
QString filetype=fi.suffix();
// Don't expect to change settings if not changing file type
if( filetype != mLastFileType ) loadSettings( fi.suffix(), true );
mLastFileType = filetype;
}

void QgsDelimitedTextSourceSelect::saveSettingsForFile( QString filename )
Expand Down Expand Up @@ -476,6 +480,7 @@ void QgsDelimitedTextSourceSelect::updateFieldLists()

tblSample->setHorizontalHeaderLabels( fieldList );
tblSample->resizeColumnsToContents();
tblSample->resizeRowsToContents();

// We don't know anything about a text based field other
// than its name. All fields are assumed to be text
Expand Down
1 change: 1 addition & 0 deletions src/providers/delimitedtext/qgsdelimitedtextsourceselect.h
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ class QgsDelimitedTextSourceSelect : public QDialog, private Ui::QgsDelimitedTex
int mExampleRowCount;
QString mColumnNamePrefix;
QString mPluginKey;
QString mLastFileType;

private slots:
void on_buttonBox_accepted();
Expand Down
Loading

0 comments on commit 4fa44cf

Please sign in to comment.