Commit fab2c57

Merge pull request #527 from ccrook/master
Fix of delimited text provider to handle CSV files including quoted newlines properly
2 parents 82b41db + 632bfbb commit fab2c57

21 files changed, 3886 additions and 1021 deletions

src/core/qgsvectorlayer.h

Lines changed: 259 additions & 4 deletions
@@ -142,8 +142,252 @@ struct CORE_EXPORT QgsVectorJoinInfo
142142

143143

144144
/** \ingroup core
145-
* Vector layer backed by a data source provider.
145+
* Represents a vector layer which manages a vector based data set.
146+
*
147+
* The QgsVectorLayer is instantiated by specifying the name of a data provider,
148+
* such as postgres or wfs, and url defining the specific data set to connect to.
149+
* The vector layer constructor in turn instantiates a QgsVectorDataProvider subclass
150+
* corresponding to the provider type, and passes it the url. The data provider
151+
* connects to the data source.
152+
*
153+
* The QgsVectorLayer provides a common interface to the different data types. It also
154+
* manages editing transactions.
155+
*
156+
* Sample usage of the QgsVectorLayer class:
157+
*
158+
* \code
159+
* QString uri = "point?crs=epsg:4326&field=id:integer";
160+
* QgsVectorLayer *scratchLayer = new QgsVectorLayer(uri, "Scratch point layer", "memory");
161+
* \endcode
162+
*
163+
* The main data providers supported by QGis are listed below.
164+
*
165+
* \section providers Vector data providers
166+
*
167+
* \subsection memory Memory data provider (memory)
168+
*
169+
* The memory data provider is used to construct in memory data, for example scratch
170+
* data or data generated from spatial operations such as contouring. There is no
171+
* inherent persistent storage of the data. The data source uri is constructed. The
172+
* url specifies the geometry type ("point", "linestring", "polygon",
173+
* "multipoint","multilinestring","multipolygon"), optionally followed by url parameters
174+
* as follows:
175+
*
176+
* - crs=definition
177+
* Defines the coordinate reference system to use for the layer.
178+
* definition is any string accepted by QgsCoordinateReferenceSystem::createFromString()
179+
*
180+
* - index=yes
181+
* Specifies that the layer will be constructed with a spatial index
182+
*
183+
* - field=name:type(length,precision)
184+
* Defines an attribute of the layer. Multiple field parameters can be added
185+
* to the data provider definition. type is one of "integer", "double", "string".
186+
*
187+
* An example url is "Point?crs=epsg:4326&field=id:integer&field=name:string(20)&index=yes"
188+
*
189+
* \subsection ogr OGR data provider (ogr)
190+
*
191+
* Accesses data using the OGR drivers (http://www.gdal.org/ogr/ogr_formats.html). The url
192+
* is the OGR connection string. A wide variety of data formats can be accessed using this
193+
* driver, including file based formats used by many GIS systems, database formats, and
194+
* web services. Some of these formats are also supported by custom data providers listed
195+
* below.
196+
*
197+
* \subsection spatialite Spatialite data provider (spatialite)
198+
*
199+
* Access data in a spatialite database. The url defines the connection parameters, table,
200+
* geometry column, and other attributes. The url can be constructed using the
201+
* QgsDataSourceURI class.
202+
*
203+
* \subsection postgres Postgresql data provider (postgres)
204+
*
205+
* Connects to a postgresql database. The url defines the connection parameters, table,
206+
* geometry column, and other attributes. The url can be constructed using the
207+
* QgsDataSourceURI class.
208+
*
209+
* \subsection mssql Microsoft SQL server data provider (mssql)
210+
*
211+
* Connects to a Microsoft SQL server database. The url defines the connection parameters, table,
212+
* geometry column, and other attributes. The url can be constructed using the
213+
* QgsDataSourceURI class.
214+
*
215+
* \subsection sqlanywhere SQL Anywhere data provider (sqlanywhere)
216+
*
217+
* Connects to an SQLanywhere database. The url defines the connection parameters, table,
218+
* geometry column, and other attributes. The url can be constructed using the
219+
* QgsDataSourceURI class.
220+
*
221+
* \subsection wfs WFS (web feature service) data provider (wfs)
222+
*
223+
* Used to access data provided by a web feature service.
224+
*
225+
* The url can be a HTTP url to a WFS 1.0.0 server or a GML2 data file path.
226+
* Examples are http://foobar/wfs or /foo/bar/file.gml
227+
*
228+
* If a GML2 file path is provided the driver will attempt to read the schema from a
229+
* file in the same directory with the same basename + ".xsd". This xsd file must be
230+
* in the same format as a WFS describe feature type response. If no xsd file is provided
231+
* then the driver will attempt to guess the attribute types from the file.
232+
*
233+
* In the case of an HTTP URL the 'FILTER' query string parameter can be used to filter
234+
* the WFS feature type. The 'FILTER' key value can either be a QGIS expression
235+
* or an OGC XML filter. If the value is set to a QGIS expression the driver will
236+
* turn it into an OGC XML filter before passing it to the WFS server. Beware the
237+
* QGIS expression filter only supports the "=, !=, <, >, <=, >=, AND, OR, NOT, LIKE, IS NULL"
238+
* attribute operators, the "BBOX, Disjoint, Intersects, Touches, Crosses, Contains, Overlaps, Within"
239+
* spatial binary operators and the QGIS local "geomFromWKT, geomFromGML"
240+
* geometry constructor functions.
241+
*
242+
* Also note:
243+
*
244+
* - You can use various functions available in the QGIS Expression list,
245+
* however the function must exist server side and have the same name and arguments to work.
246+
*
247+
* - Use the special $geometry parameter to provide the layer geometry column as input
248+
* into the spatial binary operators e.g. intersects($geometry, geomFromWKT('POINT (5 6)'))
249+
*
250+
* \subsection delimitedtext Delimited text file data provider (delimitedtext)
251+
*
252+
* Accesses data in a delimited text file, for example CSV files generated by
253+
* spreadsheets. The contents of the file are split into columns based on specified
254+
* delimiter characters. Each record may be represented spatially either by an
255+
* X and Y coordinate column, or by a WKT (well known text) formatted column.
256+
*
257+
* The url defines the filename, the formatting options (how the
258+
* text in the file is divided into data fields), and which fields contain the
259+
* X,Y coordinates or WKT text definition. The options are specified as url query
260+
* items.
261+
*
262+
* At its simplest the url can just be the filename, in which case it will be loaded
263+
* as a CSV formatted file.
264+
*
265+
* The url may include the following items:
266+
*
267+
* - encoding=UTF-8
268+
*
269+
* Defines the character encoding in the file. The default is UTF-8. To use
270+
* the default encoding for the operating system use "System".
271+
*
272+
* - type=(csv|regexp|whitespace|plain)
273+
*
274+
* Defines the algorithm used to split records into columns. Records are
275+
* defined by new lines, except for csv format files for which quoted fields
276+
* may span multiple lines. The default type is csv.
277+
*
278+
* - "csv" splits the file based on three sets of characters:
279+
* delimiter characters, quote characters,
280+
* and escape characters. Delimiter characters mark the end
281+
* of a field. Quote characters enclose a field which can contain
282+
* delimiter characters, and newlines. Escape characters cause the
283+
* following character to be treated literally (including delimiter,
284+
* quote, and newline characters). Escape and quote characters must
285+
* be different from delimiter characters. Escape characters that are
286+
* also quote characters are treated specially - they can only
287+
* escape themselves within quotes. Elsewhere they are treated as
288+
* quote characters. The defaults for delimiter, quote, and escape
289+
* are ',', '"', '"'.
290+
* - "regexp" splits each record using a regular expression (see QRegExp
291+
* documentation for details).
292+
* - "whitespace" splits each record based on whitespace (one or more whitespace
293+
* characters). Leading whitespace in the record is ignored.
294+
* - "plain" is provided for backwards compatibility. It is equivalent to
295+
* CSV except that the default quote characters are single and double quotes,
296+
* and there are no escape characters.
297+
*
298+
* - delimiter=characters
299+
*
300+
* Defines the delimiter characters used for csv and plain type files, or the
301+
* regular expression for regexp type files. It is a literal string of characters
302+
* except that "\t" may be used to represent a tab character.
303+
*
304+
* - quote=characters
305+
*
306+
* Defines the characters that are used as quote characters for csv and plain type
307+
* files.
308+
*
309+
* - escape=characters
310+
*
311+
* Defines the characters used to escape delimiter, quote, and newline characters.
312+
*
313+
* - skipEmptyFields=(yes|no)
314+
*
315+
* If yes then empty fields will be discarded (equivalent to concatenating consecutive
316+
* delimiters)
317+
*
318+
* - trimFields=(yes|no)
319+
*
320+
* If yes then leading and trailing whitespace will be removed from fields
321+
*
322+
* - skipLines=n
323+
*
324+
* Defines the number of lines to ignore at the beginning of the file (default 0)
325+
*
326+
* - useHeader=(yes|no)
327+
*
328+
* Defines whether the first record in the file (after skipped lines) contains
329+
* column names (default yes)
330+
*
331+
* - xField=column yField=column
332+
*
333+
* Defines the name of the columns holding the x and y coordinates for XY point geometries.
334+
* If useHeader is no (i.e. there are no column names), then this is the column
335+
* number (with the first column as 1).
336+
*
337+
* - decimalPoint=c
338+
*
339+
* Defines a character that is used as a decimal point in the X and Y columns.
340+
* The default is '.'.
341+
*
342+
* - xyDms=(yes|no)
343+
*
344+
* If yes then the X and Y coordinates are interpreted as
345+
* degrees/minutes/seconds format (fairly permissively),
346+
* or degree/minutes format.
347+
*
348+
* - wktField=column
349+
*
350+
* Defines the name of the column holding the WKT geometry definition for WKT geometries.
351+
* If useHeader is no (i.e. there are no column names), then this is the column
352+
* number (with the first column as 1).
353+
*
354+
* - geomType=(point|line|polygon|none)
355+
*
356+
* Defines the geometry type for WKT type geometries. QGis will only display one
357+
* type of geometry for the layer - any others will be ignored when the file is
358+
* loaded. By default the provider uses the type of the first geometry in the file.
359+
* Use geomType to override this type.
360+
*
361+
* geomType can also be set to none, in which case the layer is loaded without
362+
* geometries.
363+
*
364+
* - crs=crsstring
365+
*
366+
* Defines the coordinate reference system used for the layer. This can be
367+
* any string accepted by QgsCoordinateReferenceSystem::createFromString()
368+
*
369+
* - quiet
370+
*
371+
* Errors encountered loading the file will not be reported in a user dialog if
372+
* quiet is included (They will still be shown in the output log).
373+
*
374+
* \subsection gpx GPX data provider (gpx)
375+
*
376+
* Provider reads tracks, routes, and waypoints from a GPX file. The url
377+
* defines the name of the file, and the type of data to retrieve from it
378+
* ("track", "route", or "waypoint").
379+
*
380+
* An example url is "/home/user/data/holiday.gpx?type=route"
381+
*
382+
* \subsection grass Grass data provider (grass)
383+
*
384+
* Provider to display vector data in a GRASS GIS layer.
385+
*
386+
*
387+
*
146388
*/
389+
390+
147391
class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
148392
{
149393
Q_OBJECT
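
The "csv" splitting rules described in the class comment above (delimiter, quote, and escape characters, with quoted fields allowed to contain delimiters and newlines) can be illustrated with a small standalone sketch. This is not the provider's code from qgsdelimitedtextfile.cpp. The function name splitCsvRecords, the restriction to a single character for each of delimiter, quote, and escape, and the choice to skip blank lines and ignore carriage returns are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Record = std::vector<std::string>;

// Hypothetical helper (not part of QGIS): splits text into records and fields
// following the documented "csv" rules, with one character per role.
std::vector<Record> splitCsvRecords( const std::string &text, char delimiter = ',',
                                     char quote = '"', char escape = '"' )
{
  // The documentation requires quote and escape to differ from the delimiter.
  assert( delimiter != quote && delimiter != escape );

  std::vector<Record> records;
  Record record;
  std::string field;
  bool inQuotes = false;
  bool started = false; // true once the current record has any content

  auto endField = [&]() { record.push_back( field ); field.clear(); };
  auto endRecord = [&]()
  {
    if ( started ) { endField(); records.push_back( record ); } // blank lines are skipped (assumption)
    record.clear();
    field.clear();
    started = false;
  };

  for ( std::size_t i = 0; i < text.size(); ++i )
  {
    const char c = text[i];
    const bool hasNext = i + 1 < text.size();
    if ( inQuotes )
    {
      if ( c == escape && escape == quote )
      {
        // An escape that is also a quote can only escape itself inside quotes.
        if ( hasNext && text[i + 1] == quote ) { field += quote; ++i; }
        else inQuotes = false;
      }
      else if ( c == escape ) { if ( hasNext ) field += text[++i]; }
      else if ( c == quote ) inQuotes = false;
      else field += c; // delimiters and newlines are literal inside quotes
    }
    else if ( c == quote ) { inQuotes = true; started = true; }
    else if ( c == escape ) { if ( hasNext ) { field += text[++i]; started = true; } }
    else if ( c == delimiter ) { endField(); started = true; }
    else if ( c == '\n' ) endRecord();
    else if ( c == '\r' ) { /* carriage returns ignored (assumption) */ }
    else { field += c; started = true; }
  }
  endRecord(); // flush a final record that has no trailing newline
  return records;
}
```

With the default characters, a quoted field such as "line one<newline>line two" stays in a single record, which is the case the commit message refers to as quoted newlines.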
@@ -235,7 +479,18 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
235479
QList<GroupData> mGroups;
236480
};
237481

238-
/** Constructor */
482+
/** Constructor - creates a vector layer
483+
*
484+
* The QgsVectorLayer is constructed by instantiating a data provider. The provider
485+
* interprets the supplied path (url) of the data source to connect to and access the
486+
* data.
487+
*
488+
* @param path The path or url of the data source. Typically this encodes
489+
* parameters used by the data provider as url query items.
490+
* @param baseName The name used to represent the layer in the legend
491+
* @param providerLib The name of the data provider, eg "memory", "postgres"
492+
*
493+
*/
239494
QgsVectorLayer( QString path = QString::null, QString baseName = QString::null,
240495
QString providerLib = QString::null, bool loadDefaultStyleFlag = true );
241496

@@ -337,7 +592,7 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
337592
* @see deselect(QgsFeatureIds)
338593
* @see deselect(QgsFeatureId)
339594
*/
340-
void modifySelection(QgsFeatureIds selectIds, QgsFeatureIds deselectIds );
595+
void modifySelection( QgsFeatureIds selectIds, QgsFeatureIds deselectIds );
341596

342597
/** Select not selected features and deselect selected ones */
343598
void invertSelection();
@@ -940,7 +1195,7 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
9401195
*
9411196
* @see deselect(QgsFeatureId)
9421197
*/
943-
void deselect(const QgsFeatureIds& featureIds );
1198+
void deselect( const QgsFeatureIds& featureIds );
9441199

9451200
/**
9461201
* Clear selection
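
The provider urls documented in this header share one shape: a base (a geometry type for the memory provider, a filename for delimitedtext or gpx) followed by url query items. A minimal sketch of composing such a string is given below. The helper buildProviderUri is hypothetical and uses only the standard library; it performs no percent-encoding, and treating xField and yField as separate query items is an assumption. In QGIS code the resulting string would be passed as the path argument of the QgsVectorLayer constructor, as in the \code sample in the class comment.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using QueryItems = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper (not part of QGIS): joins a base and query items into
// "base?key=value&key=value". An item with an empty value is written as a
// bare flag, e.g. "quiet". Values are NOT percent-encoded in this sketch.
std::string buildProviderUri( const std::string &base, const QueryItems &items )
{
  assert( !base.empty() );
  std::string uri = base;
  char separator = '?';
  for ( const auto &item : items )
  {
    uri += separator;
    uri += item.first;
    if ( !item.second.empty() )
    {
      uri += '=';
      uri += item.second;
    }
    separator = '&'; // every item after the first is joined with '&'
  }
  return uri;
}
```

For example, the base "point" with the items crs=epsg:4326, field=id:integer, field=name:string(20), and index=yes gives the memory provider example url quoted in the class comment, apart from the capitalisation of "Point".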

src/providers/delimitedtext/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
55
SET (DTEXT_SRCS
66
qgsdelimitedtextfeatureiterator.cpp
77
qgsdelimitedtextprovider.cpp
8+
qgsdelimitedtextfile.cpp
89
qgsdelimitedtextsourceselect.cpp
910
)
1011
