/*!
\page gdal_virtual_file_systems GDAL Virtual File Systems (compressed, network hosted, etc...): /vsimem, /vsizip, /vsitar, /vsicurl, ...
\section gdal_virtual_file_systems_toc Contents
<ol>
<li> \ref gdal_virtual_file_systems_intro
<li> \ref gdal_virtual_file_systems_chaining
<li> \ref gdal_virtual_file_systems_drivers
<li> \ref gdal_virtual_file_systems_vsizip
<li> \ref gdal_virtual_file_systems_vsigzip
<li> \ref gdal_virtual_file_systems_vsitar
<li> \ref gdal_virtual_file_systems_network
<ol>
<li> \ref gdal_virtual_file_systems_vsicurl
<li> \ref gdal_virtual_file_systems_vsicurl_streaming
<li> \ref gdal_virtual_file_systems_vsis3
<li> \ref gdal_virtual_file_systems_vsis3_streaming
<li> \ref gdal_virtual_file_systems_vsigs
<li> \ref gdal_virtual_file_systems_vsigs_streaming
<li> \ref gdal_virtual_file_systems_vsiaz
<li> \ref gdal_virtual_file_systems_vsiaz_streaming
<li> \ref gdal_virtual_file_systems_vsioss
<li> \ref gdal_virtual_file_systems_vsioss_streaming
<li> \ref gdal_virtual_file_systems_vsiswift
<li> \ref gdal_virtual_file_systems_vsiswift_streaming
<li> \ref gdal_virtual_file_systems_vsihdfs
</ol>
<li> \ref gdal_virtual_file_systems_vsistdin
<li> \ref gdal_virtual_file_systems_vsistdout
<li> \ref gdal_virtual_file_systems_vsimem
<li> \ref gdal_virtual_file_systems_subfile
<li> \ref gdal_virtual_file_systems_vsisparse
<li> \ref gdal_virtual_file_systems_vsicache
<li> \ref gdal_virtual_file_systems_vsicrypt
</ol>
\section gdal_virtual_file_systems_intro Introduction
GDAL can access files located on "standard" file systems, i.e. in the / hierarchy
on Unix-like systems or in C:\, D:\, etc... drives on Windows. But most GDAL
raster and vector drivers use a GDAL specific abstraction to access files. This
makes it possible to access less standard types of files, such as in-memory files,
compressed files (.zip, .gz, .tar, .tar.gz archives), encrypted files, files
stored on a network (either publicly accessible, or in private buckets of commercial
cloud storage services), etc.
Each special file system has a prefix, and the general syntax to name a file is
/vsiPREFIX/...
Example:
<pre>
gdalinfo /vsizip/my.zip/my.tif
</pre>
\section gdal_virtual_file_systems_chaining Chaining
It is possible to chain multiple file system handlers.
Running ogrinfo on a shapefile in a zip file on the internet:
<pre>
ogrinfo -ro -al -so /vsizip//vsicurl/https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.zip
</pre>
Running ogrinfo on a shapefile in a zip file on an FTP server:
<pre>
ogrinfo -ro -al -so /vsizip//vsicurl/ftp://user:password@example.com/foldername/file.zip/example.shp
</pre>
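As a minimal sketch of how such chained names are composed, the following Python helper (chain_vsi is a hypothetical name, not part of GDAL) simply prepends each handler prefix to the path handled by the next one. It only builds the string; opening the resulting name is left to GDAL.

```python
def chain_vsi(handlers, target):
    """Prefix target with each /vsiXXX/ handler, outermost handler first.

    chain_vsi is an illustrative helper, not a GDAL function.
    """
    path = target
    # Apply the innermost handler first so that the outermost ends up in front.
    for handler in reversed(handlers):
        path = f"/{handler}/{path}"
    return path


archive = chain_vsi(["vsizip", "vsicurl"],
                    "ftp://user:password@example.com/foldername/file.zip")
# Append the path of the file inside the archive.
print(archive + "/example.shp")
# /vsizip//vsicurl/ftp://user:password@example.com/foldername/file.zip/example.shp
```

The doubled slash after /vsizip/ is expected: it is the handler prefix followed by a name that itself starts with a slash.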
\section gdal_virtual_file_systems_drivers Drivers supporting virtual file systems
Virtual file systems can only be used with GDAL or OGR drivers supporting the
"large file API", which is now the vast majority of file based drivers. The
full list of these formats can be obtained by looking at the drivers marked with
'v' when running either gdalinfo --formats or ogrinfo --formats.
Notable exceptions are the netCDF, HDF4 and HDF5 drivers.
\section gdal_virtual_file_systems_vsizip /vsizip/ (.zip archives)
/vsizip/ is a file handler that allows reading ZIP archives on-the-fly
without decompressing them beforehand.
To point to a file inside a zip file, the filename must be of the form
/vsizip/path/to/the/file.zip/path/inside/the/zip/file, where
path/to/the/file.zip is relative or absolute and path/inside/the/zip/file is
the relative path to the file inside the archive.
To use the .zip as a directory, you can use /vsizip/path/to/the/file.zip or
/vsizip/path/to/the/file.zip/subdir. Directory listing is available with
VSIReadDir(). A VSIStatL("/vsizip/...") call will return the uncompressed size
of the file. Directories inside the ZIP file can be distinguished from regular
files with the VSI_ISDIR(stat.st_mode) macro as for regular file systems.
Getting directory listing and file statistics are fast operations.
Note: in the particular case where the .zip file contains a single file located
at its root, just mentioning "/vsizip/path/to/the/file.zip" will work.
Examples:
<pre>
/vsizip/my.zip/my.tif (relative path to the .zip)
/vsizip//home/even/my.zip/subdir/my.tif (absolute path to the .zip)
/vsizip/c:\\users\\even\\my.zip\\subdir\\my.tif
</pre>
.kmz, .ods and .xlsx extensions are also detected as valid extensions for
zip-compatible archives.
Starting with GDAL 2.2, an alternate syntax is available that enables
chaining and does not depend on the .zip extension:
/vsizip/{/path/to/the/archive}/path/inside/the/zip/file.
Note that /path/to/the/archive may also itself use this alternate syntax.
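The alternate syntax can be sketched with a small Python helper. The name braced_vsi and the sample paths are assumptions made for illustration; only the brace layout comes from the description above. The same layout is documented for /vsitar/ further down, hence the handler parameter.

```python
def braced_vsi(archive, inner, handler="vsizip"):
    """Build /HANDLER/{archive}/inner using the brace-delimited syntax.

    braced_vsi is an illustrative helper, not a GDAL function.
    """
    return "/" + handler + "/{" + archive + "}/" + inner


# An archive whose name does not end with .zip (hypothetical path).
print(braced_vsi("/data/archive.bin", "my.tif"))
# /vsizip/{/data/archive.bin}/my.tif

# The archive name may itself use the alternate syntax (zip inside a zip).
nested = braced_vsi(braced_vsi("/data/outer.bin", "inner.bin"), "my.tif")
print(nested)
# /vsizip/{/vsizip/{/data/outer.bin}/inner.bin}/my.tif
```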
Write capabilities are also available. They allow creating
a new zip file and adding new files to an already existing (or just created)
zip file.
Read and write operations cannot be interleaved. The new zip must be
closed before being re-opened for read.
\section gdal_virtual_file_systems_vsigzip /vsigzip/ (gzipped file)
/vsigzip/ is a file handler that allows on-the-fly reading
of GZip (.gz) files without decompressing them beforehand.
To view a gzipped file as uncompressed by GDAL, you must use the
/vsigzip/path/to/the/file.gz syntax, where
path/to/the/file.gz is relative or absolute.
Examples:
<pre>
/vsigzip/my.gz (relative path to the .gz)
/vsigzip//home/even/my.gz (absolute path to the .gz)
/vsigzip/c:\\users\\even\\my.gz
</pre>
VSIStatL() will return the uncompressed file size, but this is potentially
a slow operation on large files, since it requires uncompressing the whole file.
Seeking to the end of the file, or at random locations, is similarly slow. To
speed up that process, "snapshots" are internally created in memory so that
seeking to parts of the file that have already been decompressed is faster.
This snapshot mechanism also applies to /vsizip/ files.
When the file is located in a writable location, a file with extension .gz.properties
is created with an indication of the uncompressed file size (the creation of that
file can be disabled by setting the CPL_VSIL_GZIP_WRITE_PROPERTIES configuration
option to NO).
\section gdal_virtual_file_systems_vsitar /vsitar/ (.tar, .tgz archives)
/vsitar/ is a file handler that allows on-the-fly reading
of regular uncompressed .tar or compressed .tgz or .tar.gz archives,
without decompressing them beforehand.
To point to a file inside a .tar, .tgz or .tar.gz file, the filename must be of the form
/vsitar/path/to/the/file.tar/path/inside/the/tar/file, where
path/to/the/file.tar is relative or absolute and path/inside/the/tar/file is
the relative path to the file inside the archive.
To use the .tar as a directory, you can use /vsitar/path/to/the/file.tar or
/vsitar/path/to/the/file.tar/subdir. Directory listing is available with
VSIReadDir(). A VSIStatL("/vsitar/...") call will return the uncompressed size
of the file. Directories inside the TAR file can be distinguished from regular
files with the VSI_ISDIR(stat.st_mode) macro as for regular file systems.
Getting directory listing and file statistics are fast operations.
Note: in the particular case where the .tar file contains a single file located
at its root, just mentioning "/vsitar/path/to/the/file.tar" will work.
Examples:
<pre>
/vsitar/my.tar/my.tif (relative path to the .tar)
/vsitar//home/even/my.tar/subdir/my.tif (absolute path to the .tar)
/vsitar/c:\\users\\even\\my.tar\\subdir\\my.tif
</pre>
Starting with GDAL 2.2, an alternate syntax is available that enables
chaining and does not depend on the .tar extension:
/vsitar/{/path/to/the/archive}/path/inside/the/tar/file.
Note that /path/to/the/archive may also itself use this alternate syntax.
\section gdal_virtual_file_systems_network Network based file systems
A generic \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" file system
handler exists for online resources that do
not require particular signed authentication schemes. It is specialized into
sub-filesystems for commercial cloud storage services, such as
\ref gdal_virtual_file_systems_vsis3 "/vsis3/",
\ref gdal_virtual_file_systems_vsigs "/vsigs/",
\ref gdal_virtual_file_systems_vsiaz "/vsiaz/",
\ref gdal_virtual_file_systems_vsioss "/vsioss/" or
\ref gdal_virtual_file_systems_vsiswift "/vsiswift/".
When entire files can be read in a streaming way, prefer using
\ref gdal_virtual_file_systems_vsicurl_streaming "/vsicurl_streaming/",
or its variants for the above cloud storage services, for better efficiency.
\subsection gdal_virtual_file_systems_vsicurl /vsicurl/ (http/https/ftp files: random access)
/vsicurl/ is a file system handler that allows on-the-fly random reading of
files available through HTTP/FTP web protocols, without prior download of the
entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsicurl/http[s]://path/to/remote/resource
or /vsicurl/ftp://path/to/remote/resource where path/to/remote/resource is
the URL of a remote resource.
Example of running ogrinfo on a shapefile on the internet:
<pre>
ogrinfo -ro -al -so /vsicurl/https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp
</pre>
Starting with GDAL 2.3, options can be passed in the filename with the
following syntax:
/vsicurl?[option_i=val_i&]*url=http://...
where each option name and value (including the value of "url") is URL-encoded.
Currently supported options are :
<ul>
<li>use_head=yes/no: whether the HTTP HEAD request can be emitted.
Defaults to YES.
Setting this option overrides the behaviour of the
CPL_VSIL_CURL_USE_HEAD configuration option.</li>
<li>max_retry=number: defaults to 0.
Setting this option overrides the behaviour of the
GDAL_HTTP_MAX_RETRY configuration option.</li>
<li>retry_delay=number_in_seconds: defaults to 30.
Setting this option overrides the behaviour of the
GDAL_HTTP_RETRY_DELAY configuration option.</li>
<li>list_dir=yes/no: whether an attempt should be made to read the file list of the
directory where the file is located. Defaults to YES.</li>
</ul>
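Because every option name and value, including the value of "url", has to be URL-encoded, building such a filename by hand is error-prone. The following Python sketch uses the standard urllib.parse.quote function for the encoding; vsicurl_with_options and the sample URL are hypothetical names chosen for illustration.

```python
from urllib.parse import quote


def vsicurl_with_options(url, **options):
    """Build a /vsicurl?opt=val&...&url=... filename.

    vsicurl_with_options is an illustrative helper, not a GDAL function.
    Names and values are percent-encoded, with no character left as "safe"
    apart from the unreserved ones.
    """
    parts = [
        quote(str(name), safe="") + "=" + quote(str(value), safe="")
        for name, value in options.items()
    ]
    # The url option is placed last, as in the syntax given above.
    parts.append("url=" + quote(url, safe=""))
    return "/vsicurl?" + "&".join(parts)


print(vsicurl_with_options("http://example.com/my data.tif",
                           use_head="no", max_retry=3))
# /vsicurl?use_head=no&max_retry=3&url=http%3A%2F%2Fexample.com%2Fmy%20data.tif
```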
Partial downloads (requires the HTTP server to support random reading) are
done with a 16 KB granularity by default. Starting with GDAL 2.3, the chunk
size can be configured with the CPL_VSIL_CURL_CHUNK_SIZE configuration option,
with a value in bytes. If the driver detects sequential
reading it will progressively increase the chunk size up to 2 MB to improve
download performance. Starting with GDAL 2.3, the GDAL_INGESTED_BYTES_AT_OPEN
configuration option can be set to impose the number of bytes read in one
GET call at file opening (this can help performance when reading Cloud Optimized
GeoTIFF files with a large header).
The GDAL_HTTP_PROXY, GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration
options can be used to define a proxy server. The syntax to use is the one of
Curl CURLOPT_PROXY, CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
Starting with GDAL 2.1.3, the CURL_CA_BUNDLE or SSL_CERT_FILE configuration
options can be used to set the path to the Certification Authority (CA)
bundle file (if not specified, curl will use a file in a system location).
Starting with GDAL 2.3, additional HTTP headers can be sent by setting the
GDAL_HTTP_HEADER_FILE configuration
option to point to a filename of a text file with "key: value" HTTP headers.
Starting with GDAL 2.3, the GDAL_HTTP_MAX_RETRY (number of attempts) and
GDAL_HTTP_RETRY_DELAY (in seconds) configuration options can be set, so that
request retries are done in case of HTTP errors 429, 502, 503 or 504.
More generally, the options of CPLHTTPFetch() that can be set through
configuration options apply.
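The retry behaviour can be pictured with the following Python sketch. It is an illustration only, not GDAL's implementation: fetch is a hypothetical callable returning an HTTP status code, and the sketch assumes that max_retry counts the additional attempts made after the first one.

```python
import time

# HTTP status codes for which a retry is attempted, as listed above.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def fetch_with_retry(fetch, max_retry=0, retry_delay=30.0, sleep=time.sleep):
    """Call fetch() until it returns a non-retryable status or retries run out.

    Returns a (status, attempts) tuple. The sleep parameter only exists so
    that the delay can be replaced in tests.
    """
    attempts = 0
    while True:
        status = fetch()
        attempts += 1
        if status not in RETRYABLE_STATUSES or attempts > max_retry:
            return status, attempts
        sleep(retry_delay)


# Simulated server answering 503, then 429, then 200.
answers = iter([503, 429, 200])
print(fetch_with_retry(lambda: next(answers), max_retry=3,
                       retry_delay=0, sleep=lambda seconds: None))
# (200, 3)
```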
The file can be cached in RAM by setting the configuration option
VSI_CACHE to TRUE. The cache size defaults to 25 MB, but can be
modified by setting the configuration option VSI_CACHE_SIZE (in
bytes). Content in that cache is discarded when the file handle is
closed.
In addition, a global least-recently-used cache of 16 MB shared among all
downloaded content is enabled by default, and content in it may be reused
after a file handle has been closed and reopened, during the life-time of the
process or until VSICurlClearCache() is called. Starting with GDAL 2.3, the
size of this global LRU cache can be modified by setting the configuration
option CPL_VSIL_CURL_CACHE_SIZE (in bytes).
Starting with GDAL 2.3, the
CPL_VSIL_CURL_NON_CACHED configuration option can be set to values like
"/vsicurl/http://example.com/foo.tif:/vsicurl/http://example.com/some_directory",
so that at file handle closing, all cached content related to the mentioned
file(s) is no longer cached. This can help when dealing with resources that
can be modified during execution of GDAL related code. Alternatively,
VSICurlClearCache() can be used.
Starting with GDAL 2.1, /vsicurl/ will try to query directly redirected URLs
to Amazon S3 signed URLs during their validity period, so as to minimize
round-trips. This behaviour can be disabled by setting the configuration
option CPL_VSIL_CURL_USE_S3_REDIRECT to NO.
VSIStatL() will return the size in the st_size member and the file nature (file or
directory) in the st_mode member (the latter is only reliable with FTP resources for
now).
VSIReadDir() should be able to parse the HTML directory listing returned by
the most popular web servers, such as Apache or Microsoft IIS.
\subsection gdal_virtual_file_systems_vsicurl_streaming /vsicurl_streaming/ (http/https/ftp files: streaming)
/vsicurl_streaming/ is a file system handler that allows on-the-fly sequential
reading of files streamed through HTTP/FTP web protocols, without prior download of the
entire file. It requires GDAL to be built against libcurl.
Although this file handler is able to seek to random offsets in the file, this
will not be efficient. If you need efficient random access and the
server supports range downloading, you should use the /vsicurl/ file system
handler instead.
Recognized filenames are of the form /vsicurl_streaming/http[s]://path/to/remote/resource
or /vsicurl_streaming/ftp://path/to/remote/resource where path/to/remote/resource is
the URL of a remote resource.
The GDAL_HTTP_PROXY, GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration
options can be used to define a proxy server. The syntax to use is the one of
Curl CURLOPT_PROXY, CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
Starting with GDAL 2.1.3, the CURL_CA_BUNDLE or SSL_CERT_FILE configuration
options can be used to set the path to the Certification Authority (CA)
bundle file (if not specified, curl will use a file in a system location).
The file can be cached in RAM by setting the configuration option VSI_CACHE
to TRUE. The cache size defaults to 25 MB, but can be modified by setting the
configuration option VSI_CACHE_SIZE (in bytes).
VSIStatL() will return the size in the st_size member and the file nature (file or
directory) in the st_mode member (the latter is only reliable with FTP resources for
now).
\subsection gdal_virtual_file_systems_vsis3 /vsis3/ (AWS S3 files: random reading)
/vsis3/ is a file system handler that allows on-the-fly random reading of
(primarily non-public) files available in AWS S3 buckets, without prior download of the
entire file. It requires GDAL to be built against libcurl.
It also allows sequential writing of files (no seeks or read operations are
then allowed). Deletion of files with VSIUnlink() is also supported.
Starting with GDAL 2.3, creation of directories with VSIMkdir() and deletion
of (empty) directories with VSIRmdir() are also possible.
Recognized filenames are of the form /vsis3/bucket/key where
bucket is the name of the S3 bucket and key the S3 object "key", i.e.
a filename potentially containing subdirectories.
The generalities of \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" apply.
Several authentication methods are possible. In order of priority (the first
one listed takes precedence):
<ol>
<li>If the AWS_NO_SIGN_REQUEST=YES configuration option is set, request signing
is disabled. This option might be used for buckets with public access rights.
Available since GDAL 2.3.</li>
<li>The AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID configuration options can
be set. The AWS_SESSION_TOKEN configuration option must be set when
temporary credentials are used.</li>
<li>Starting with GDAL 2.3, alternate ways of providing credentials, similar to
what the "aws" command line utility or Boto3 support, can be used. If the
above mentioned environment variables are not provided, the ~/.aws/credentials
or %UserProfile%/.aws/credentials file will be read (or the file pointed to by
CPL_AWS_CREDENTIALS_FILE). The profile may be
specified with the AWS_PROFILE environment variable (the default profile is "default").</li>
<li>The ~/.aws/config or %UserProfile%/.aws/config file may also be used (or the
file pointed to by AWS_CONFIG_FILE) to retrieve credentials and the AWS region.</li>
<li>If none of the above methods succeeds, instance profile credentials will be
retrieved when GDAL is used on EC2 instances.</li>
</ol>
</ol>
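The precedence described above can be summarised with a small Python sketch. It is a simplified illustration, not GDAL's code: pick_s3_auth and its return labels are hypothetical, the options are passed as a plain dictionary, only the exact value YES is recognised (case-insensitively), and the file based and EC2 methods are grouped into a single fallback.

```python
def pick_s3_auth(options):
    """Return a label for the first authentication method that applies.

    options is a dict of configuration option names to string values.
    """
    # 1. Unsigned requests take precedence over any credentials.
    if options.get("AWS_NO_SIGN_REQUEST", "").upper() == "YES":
        return "unsigned"
    # 2. Explicit access keys given as configuration options.
    if options.get("AWS_SECRET_ACCESS_KEY") and options.get("AWS_ACCESS_KEY_ID"):
        return "access keys"
    # 3 to 5. Credentials file, config file, then EC2 instance profile.
    return "credentials file, config file or EC2 instance profile"


print(pick_s3_auth({"AWS_NO_SIGN_REQUEST": "YES",
                    "AWS_ACCESS_KEY_ID": "id",
                    "AWS_SECRET_ACCESS_KEY": "secret"}))
# unsigned
```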
The AWS_REGION (or AWS_DEFAULT_REGION
starting with GDAL 2.3) configuration option may be
set to one of the supported
<a href="http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region">S3
regions</a> and defaults to 'us-east-1'.
Starting with GDAL 2.2, the
AWS_REQUEST_PAYER configuration option may be set to "requester" to
facilitate use with
<a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/RequesterPaysBuckets.html">Requester
Pays buckets</a>.
The AWS_S3_ENDPOINT configuration option defaults to s3.amazonaws.com.
On writing, the file is uploaded using the S3
<a href="http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html">multipart upload API</a>.
The size of chunks is set to 50 MB by default, allowing creating files up to
500 GB (10000 parts of 50 MB each). If larger files are needed, then increase
the value of the VSIS3_CHUNK_SIZE config option to a larger value (expressed
in MB). If the process is killed and the file is not properly closed, the
multipart upload will remain open, causing Amazon to charge you for the storage
of the parts. You will have to abort such "ghost" uploads yourself by other means
(e.g. with the <a href="http://s3tools.org/s3cmd">s3cmd</a> utility). For
files smaller than the chunk size, a simple PUT request is used instead of
the multipart upload API.
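The relation between the chunk size and the largest file that can be written follows from the limit of 10000 parts mentioned above. The Python sketch below works it out in both directions; the function names are hypothetical, decimal units are assumed (1 GB = 1000 MB, as in the 500 GB figure), and the 50 MB default is used as a lower bound.

```python
import math

MAX_PARTS = 10000          # number of parts quoted in the text above
DEFAULT_CHUNK_SIZE_MB = 50  # default chunk size quoted in the text above


def max_file_size_gb(chunk_size_mb=DEFAULT_CHUNK_SIZE_MB):
    """Largest file size, in GB, reachable with the given chunk size."""
    return MAX_PARTS * chunk_size_mb / 1000


def min_chunk_size_mb(file_size_gb):
    """Smallest chunk size, in MB, that fits a file of file_size_gb."""
    needed = math.ceil(file_size_gb * 1000 / MAX_PARTS)
    return max(DEFAULT_CHUNK_SIZE_MB, needed)


print(max_file_size_gb())       # 500.0
print(min_chunk_size_mb(2000))  # 200
```

Under these assumptions, a 2000 GB file would call for VSIS3_CHUNK_SIZE to be set to at least 200.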
@since GDAL 2.1
\subsection gdal_virtual_file_systems_vsis3_streaming /vsis3_streaming/ (AWS S3 files: streaming)
/vsis3_streaming/ is a file system handler that allows on-the-fly sequential
reading of (primarily non-public) files available in AWS S3 buckets,
without prior download of the
entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsis3_streaming/bucket/key where
bucket is the name of the S3 bucket and key the S3 object "key", i.e.
a filename potentially containing subdirectories.
Authentication options, and read-only features, are identical to
\ref gdal_virtual_file_systems_vsis3 "/vsis3/".
@since GDAL 2.1
\subsection gdal_virtual_file_systems_vsigs /vsigs/ (Google Cloud Storage files: random reading)
/vsigs/ is a file system handler that allows on-the-fly random reading of
(primarily non-public) files available in Google Cloud Storage buckets,
without prior download of the
entire file. It requires GDAL to be built against libcurl.
Starting with GDAL 2.3, it also allows sequential writing of files (no seeks
or read operations are then allowed). Deletion of files with VSIUnlink(),
creation of directories with VSIMkdir() and deletion of (empty) directories with
VSIRmdir() are also possible.
Recognized filenames are of the form /vsigs/bucket/key where
bucket is the name of the bucket and key the object "key", i.e.
a filename potentially containing subdirectories.
The generalities of \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" apply.
Several authentication methods are possible. In order of priority (the first
one listed takes precedence):
<ol>
<li>The GS_SECRET_ACCESS_KEY and GS_ACCESS_KEY_ID configuration options can be
set for AWS style authentication</li>
<li>The GDAL_HTTP_HEADER_FILE configuration
option can be set to point to a text file with "key: value" headers.
Typically, it must contain an "Authorization: Bearer XXXXXXXXX" line.</li>
<li>(GDAL >= 2.3) The GS_OAUTH2_REFRESH_TOKEN
configuration option can be set to use OAuth2 client authentication.
See http://code.google.com/apis/accounts/docs/OAuth2.html
This refresh token can be obtained with the "gdal_auth.py -s storage" or
"gdal_auth.py -s storage-rw" script.
Note: instead of using the default GDAL application credentials, you may
define the GS_OAUTH2_CLIENT_ID and GS_OAUTH2_CLIENT_SECRET configuration
options (they need to be defined both when running gdal_auth.py and when later using /vsigs/).
</li>
<li>(GDAL >= 2.3) The GOOGLE_APPLICATION_CREDENTIALS configuration option
can be set to point to a JSON file containing OAuth2 service account credentials,
in particular a private key and a client email.
See https://developers.google.com/identity/protocols/OAuth2ServiceAccount
for more details on this authentication method.
The bucket must grant the "Storage Legacy Bucket Owner" or "Storage Legacy Bucket Reader"
permissions to the service account. The GS_OAUTH2_SCOPE configuration option
can be set to change the default permission scope from
"https://www.googleapis.com/auth/devstorage.read_write"
to "https://www.googleapis.com/auth/devstorage.read_only" if needed.</li>
<li>(GDAL >= 2.3) Variant of the previous method.
The GS_OAUTH2_PRIVATE_KEY (or GS_OAUTH2_PRIVATE_KEY_FILE)
and GS_OAUTH2_CLIENT_EMAIL configuration options can be set to use OAuth2
service account authentication.
See https://developers.google.com/identity/protocols/OAuth2ServiceAccount
for more details on this authentication method.
The GS_OAUTH2_PRIVATE_KEY configuration option must contain the private key
as an inline string, starting with "-----BEGIN PRIVATE KEY-----".
Alternatively the GS_OAUTH2_PRIVATE_KEY_FILE configuration option can be set
to indicate a filename that contains such a private key.
The bucket must grant the "Storage Legacy Bucket Owner" or "Storage Legacy Bucket Reader"
permissions to the service account. The GS_OAUTH2_SCOPE configuration option
can be set to change the default permission scope from
"https://www.googleapis.com/auth/devstorage.read_write"
to "https://www.googleapis.com/auth/devstorage.read_only" if needed.</li>
<li>(GDAL >= 2.3) An alternate way of providing credentials, similar to
what the "gsutil" command line utility or Boto3 support, can be used. If the
above mentioned environment variables are not provided, the ~/.boto
or %UserProfile%/.boto file will be read (or the file pointed to by
CPL_GS_CREDENTIALS_FILE) for the gs_secret_access_key and gs_access_key_id
entries for AWS style authentication. If not found, it will look for the
gs_oauth2_refresh_token (and optionally client_id and client_secret) entry
for OAuth2 client authentication.</li>
<li>(GDAL >= 2.3) Finally, if none of the above methods succeeds, the code
will check if the current machine is a Google Compute Engine instance, and
if so will use the permissions associated with it (using the default service
account associated with the VM). To force a machine to be detected as a GCE instance
(for example for code running in a container with no access to the boot logs), you
can set CPL_MACHINE_IS_GCE to YES.</li>
</ol>
@since GDAL 2.2
\subsection gdal_virtual_file_systems_vsigs_streaming /vsigs_streaming/ (Google Cloud Storage files: streaming)
/vsigs_streaming/ is a file system handler that allows on-the-fly sequential
reading of (primarily non-public) files available in Google Cloud Storage
buckets, without prior download of the
entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsigs_streaming/bucket/key where
bucket is the name of the bucket and key the object "key", i.e.
a filename potentially containing subdirectories.
Authentication options, and read-only features, are identical to
\ref gdal_virtual_file_systems_vsigs "/vsigs/".
@since GDAL 2.2
\subsection gdal_virtual_file_systems_vsiaz /vsiaz/ (Microsoft Azure Blob files: random reading)
/vsiaz/ is a file system handler that allows on-the-fly random reading of
(primarily non-public) files available in Microsoft Azure Blob containers,
without prior download of the entire file.
It requires GDAL to be built against libcurl.
It also allows sequential writing of files (no seeks
or read operations are then allowed). A block blob will be created if the
file size is below 4 MB. Beyond that size, an append blob will be created (with a
maximum file size of 195 GB).
Deletion of files with VSIUnlink(), creation of directories with VSIMkdir()
and deletion of (empty) directories with VSIRmdir() are also possible.
Note: when using VSIMkdir(), a special hidden .gdal_marker_for_dir empty
file is created, since Azure Blob does not natively support empty directories.
If that file is the last one remaining in a directory, VSIRmdir() will
automatically remove it. This file will not be seen with VSIReadDir().
If removing files from directories not created with VSIMkdir(), when the
last file is deleted, its directory is automatically removed by Azure, so
the sequence VSIUnlink("/vsiaz/container/subdir/lastfile") followed by
VSIRmdir("/vsiaz/container/subdir") will fail on the VSIRmdir() invocation.
Recognized filenames are of the form /vsiaz/container/key where
container is the name of the container and key the object "key", i.e.
a filename potentially containing subdirectories.
The generalities of \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" apply.
Several authentication methods are possible. In order of priority (the first
one listed takes precedence):
<ol>
<li>The AZURE_STORAGE_CONNECTION_STRING configuration option, given in the
access key section of the administration interface. It contains both the
account name and a secret key.
</li>
<li>The AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY configuration
options pointing respectively to the account name and a secret key.
</li>
</ol>
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsiaz_streaming /vsiaz_streaming/ (Microsoft Azure Blob files: streaming)
/vsiaz_streaming/ is a file system handler that allows on-the-fly sequential
reading of (primarily non-public) files available in Microsoft Azure Blob containers,
without prior download of the
entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsiaz_streaming/container/key where
container is the name of the container and key the object "key", i.e.
a filename potentially containing subdirectories.
Authentication options, and read-only features, are identical to
\ref gdal_virtual_file_systems_vsiaz "/vsiaz/".
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsioss /vsioss/ (Alibaba Cloud OSS files: random reading)
/vsioss/ is a file system handler that allows on-the-fly random reading of
(primarily non-public) files available in Alibaba Cloud Object Storage Service
(OSS) buckets, without prior download of the entire file.
It requires GDAL to be built against libcurl.
It also allows sequential writing of files (no seeks or read operations are
then allowed). Deletion of files with VSIUnlink() is also supported.
Creation of directories with VSIMkdir() and deletion
of (empty) directories with VSIRmdir() are also possible.
Recognized filenames are of the form /vsioss/bucket/key where
bucket is the name of the OSS bucket and key the OSS object "key", i.e.
a filename potentially containing subdirectories.
The generalities of \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" apply.
The OSS_SECRET_ACCESS_KEY and OSS_ACCESS_KEY_ID configuration options *must*
be set. The OSS_ENDPOINT configuration option should normally be set to the
appropriate value, which reflects the region attached to the bucket.
The default is oss-us-east-1.aliyuncs.com. If the bucket is
stored in a region other than oss-us-east-1, the code logic will redirect to
the appropriate endpoint.
On writing, the file is uploaded using the OSS
<a href="https://www.alibabacloud.com/help/doc-detail/31991.htm?spm=a3c0i.o31982en.b99.324.402280483cYSv0">multipart upload API</a>.
The size of chunks is set to 50 MB by default, allowing creating files up to
500 GB (10000 parts of 50 MB each). If larger files are needed, then increase
the value of the VSIOSS_CHUNK_SIZE config option to a larger value (expressed
in MB). In case the process is killed and the file not properly closed, the
multipart upload will remain open, causing Alibaba to charge you for the parts
storage. You'll have to abort yourself with other means. For
files smaller than the chunk size, a simple PUT request is used instead of
the multipart upload API.
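The relationship between chunk size and maximum file size described above is
simple arithmetic. The following Python sketch is purely illustrative: the
function names are hypothetical and not part of GDAL, and it assumes the
10000-part limit quoted above and decimal units (1 GB = 1000 MB).

```python
import math

MAX_PARTS = 10000  # part limit quoted in the text above


def max_file_size_mb(chunk_size_mb=50):
    """Largest file (in MB) that fits in MAX_PARTS chunks of chunk_size_mb."""
    return MAX_PARTS * chunk_size_mb


def min_chunk_size_mb(file_size_mb):
    """Smallest whole-MB chunk size that lets file_size_mb fit in MAX_PARTS parts."""
    return max(1, math.ceil(file_size_mb / MAX_PARTS))


# With the default 50 MB chunks the limit is 500000 MB, i.e. 500 GB.
print(max_file_size_mb())
# Chunk size (in MB) needed for a hypothetical 2 TB file.
print(min_chunk_size_mb(2_000_000))
```

A value computed this way is what one would assign to the VSIOSS_CHUNK_SIZE
configuration option before writing a file larger than the default limit.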
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsioss_streaming /vsioss_streaming/ (Alibaba Cloud OSS files: streaming)
/vsioss_streaming/ is a file system handler that allows on-the-fly sequential
reading of (primarily non-public) files available in Alibaba Cloud Object
Storage Service (OSS) buckets, without prior download of the
entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsioss_streaming/bucket/key where
bucket is the name of the bucket and key the object "key", i.e.
a filename potentially containing subdirectories.
Authentication options, and read-only features, are identical to
\ref gdal_virtual_file_systems_vsioss "/vsioss/".
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsiswift /vsiswift/ (OpenStack Swift Object Storage: random reading)
/vsiswift/ is a file system handler that allows on-the-fly random reading of
(primarily non-public) files available in
<a href="https://developer.openstack.org/api-ref/object-store/">OpenStack Swift Object Storage</a>
(swift) buckets, without prior download of the entire file.
It requires GDAL to be built against libcurl.
It also allows sequential writing of files (no seeks or read operations are
then allowed). Deletion of files with VSIUnlink() is also supported.
Creation of directories with VSIMkdir() and deletion
of (empty) directories with VSIRmdir() are also possible.
Recognized filenames are of the form /vsiswift/bucket/key where
bucket is the name of the swift bucket and key the swift object "key", i.e.
a filename potentially containing subdirectories.
The generalities of \ref gdal_virtual_file_systems_vsicurl "/vsicurl/" apply.
Two authentication methods are possible. In order of priority (the first
mentioned takes precedence):
<ol>
<li>The SWIFT_STORAGE_URL and SWIFT_AUTH_TOKEN configuration options are set
respectively to the storage URL (e.g. http://127.0.0.1:12345/v1/AUTH_something)
and the value of the x-auth-token authorization token.
</li>
<li>The SWIFT_AUTH_V1_URL, SWIFT_USER and SWIFT_KEY configuration options are
set respectively to the endpoint of the Auth V1 authentication
(e.g. http://127.0.0.1:12345/auth/v1.0), the user name and the key/password.
This authentication endpoint will be used to retrieve the storage URL and
authorization token mentioned in the first authentication method.
</li>
</ol>
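The priority order of the two methods above can be sketched as follows. This is
an illustrative Python model only, not GDAL's implementation: the function name
and the returned labels are hypothetical, and treating empty values as unset is
an assumption.

```python
def select_swift_auth(options):
    """Return which authentication method applies, or None if neither is complete.

    `options` is a plain dict standing in for the configuration options.
    """
    # Method 1 takes precedence: storage URL and token given directly.
    if options.get("SWIFT_STORAGE_URL") and options.get("SWIFT_AUTH_TOKEN"):
        return "storage_url_and_token"
    # Method 2: Auth V1 endpoint plus credentials, used to obtain the
    # storage URL and token of method 1.
    if all(options.get(k) for k in ("SWIFT_AUTH_V1_URL", "SWIFT_USER", "SWIFT_KEY")):
        return "auth_v1"
    return None


print(select_swift_auth({
    "SWIFT_AUTH_V1_URL": "http://127.0.0.1:12345/auth/v1.0",
    "SWIFT_USER": "user",
    "SWIFT_KEY": "key",
}))
```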
This file system handler also allows sequential writing of files (no seeks
or read operations are then allowed).
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsiswift_streaming /vsiswift_streaming/ (OpenStack Swift Object Storage: streaming)
/vsiswift_streaming/ is a file system handler that allows on-the-fly sequential
reading of (primarily non-public) files available in
<a href="https://developer.openstack.org/api-ref/object-store/">OpenStack Swift Object Storage</a>
(swift) buckets, without prior download of the entire file.
It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsiswift_streaming/bucket/key where
bucket is the name of the bucket and key the object "key", i.e.
a filename potentially containing subdirectories.
Authentication options, and read-only features, are identical to
\ref gdal_virtual_file_systems_vsiswift "/vsiswift/".
@since GDAL 2.3
\subsection gdal_virtual_file_systems_vsihdfs /vsihdfs/ (Hadoop File System)
/vsihdfs/ is a file system handler that provides read access to HDFS.
This handler requires GDAL to have been built with Java support
(--with-java) and HDFS support (--with-hdfs). Support for this
handler is currently only available on Unix-like systems.
Recognized filenames are of the form /vsihdfs/hdfsUri where hdfsUri is
a valid HDFS URI.
Examples:
<pre>
/vsihdfs/file:/tmp/my.tif (a local file accessed through HDFS)
/vsihdfs/hdfs:/hadoop/my.tif (a file stored in HDFS)
</pre>
@since GDAL 2.4
\section gdal_virtual_file_systems_vsistdin /vsistdin/ (standard input streaming)
/vsistdin/ is a file handler that allows reading from the standard input stream.
The filename syntax must be only "/vsistdin/".
The file operations available are of course limited to Read() and forward Seek().
Full seeking within the first MB of a file is possible, and that first megabyte
is cached so that closing /vsistdin/, re-opening it and reading within this
first megabyte is possible multiple times in the same process.
\section gdal_virtual_file_systems_vsistdout /vsistdout/ (standard output streaming)
/vsistdout/ is a file handler that allows writing into the standard output stream.
The filename syntax must be only "/vsistdout/".
The file operations available are of course limited to Write().
A variation of this file system exists as the /vsistdout_redirect/ file
system handler, where the output function can be defined with
VSIStdoutSetRedirection().
\section gdal_virtual_file_systems_vsimem /vsimem/ (in-memory files)
/vsimem/ is a file handler that allows blocks of memory to be
treated as files. All portions of the file system underneath the base
path "/vsimem/" will be handled by this driver.
Normal VSI*L functions can be used freely to create and destroy memory
arrays treating them as if they were real file system objects. Some
additional methods exist to efficiently create memory file system objects
without duplicating original copies of the data or to "steal" the block
of memory associated with a memory file. See VSIFileFromMemBuffer() and
VSIGetMemFileBuffer().
Directory related functions are supported.
/vsimem/ files are visible within the same process. Multiple threads can
read the same underlying file, provided they use different
handles, but concurrent write and read operations on the same underlying file
are not supported (locking is left to the responsibility of the calling code).
\section gdal_virtual_file_systems_subfile /vsisubfile/ (portions of files)
The /vsisubfile/ virtual file system handler allows access to subregions of
files, treating them as a file on their own to the virtual file
system functions (VSIFOpenL(), etc).
A special form of the filename is used to indicate a subportion
of another file:
/vsisubfile/<offset>[_<size>],<filename>
The size parameter is optional. Without it the remainder of the file from
the start offset is treated as part of the subfile. Otherwise only
<size> bytes from <offset> are treated as part of the subfile.
The <filename> portion may be a relative or absolute path using normal
rules. The <offset> and <size> values are in bytes.
Examples:
/vsisubfile/1000_3000,/data/abc.ntf
/vsisubfile/5000,../xyz/raw.dat
Unlike the /vsimem/ or conventional file system handlers, there
is no meaningful support for filesystem operations for creating new
files, traversing directories, and deleting files within the /vsisubfile/
area. Only the VSIStatL(), VSIFOpenL() and operations based on the file
handle returned by VSIFOpenL() operate properly.
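The filename form described above is easy to assemble and take apart
programmatically. The following Python sketch is illustrative only: both helper
names are hypothetical and not part of GDAL, and the parser assumes the region
part holds plain decimal integers.

```python
def make_subfile_name(filename, offset, size=None):
    """Build a /vsisubfile/ name; offset and size are byte counts."""
    if offset < 0 or (size is not None and size < 0):
        raise ValueError("offset and size must be non-negative")
    region = str(offset) if size is None else f"{offset}_{size}"
    return f"/vsisubfile/{region},{filename}"


def parse_subfile_name(name):
    """Split a /vsisubfile/ name into (offset, size or None, filename)."""
    prefix = "/vsisubfile/"
    if not name.startswith(prefix):
        raise ValueError("not a /vsisubfile/ name")
    # Everything before the first comma is the region, the rest is the filename.
    region, comma, filename = name[len(prefix):].partition(",")
    if not comma:
        raise ValueError("missing ',' between region and filename")
    offset_text, _, size_text = region.partition("_")
    size = int(size_text) if size_text else None
    return int(offset_text), size, filename


print(make_subfile_name("/data/abc.ntf", 1000, 3000))
print(parse_subfile_name("/vsisubfile/5000,../xyz/raw.dat"))
```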
\section gdal_virtual_file_systems_vsisparse /vsisparse/ (sparse files)
The /vsisparse/ virtual file handler allows a virtual file to be composed
from chunks of data in other files, potentially with large spaces in
the virtual file set to a constant value. This can make it possible to
test some sorts of operations on what seems to be a large file with
image data set to a constant value. It is also helpful when wanting to
add test files to the test suite that are too large, but for which most
of the data can be ignored. It could, in theory, also be used to
treat several files on different file systems as one large virtual file.
The file referenced by /vsisparse/ should be an XML control file
formatted something like:
\verbatim
<VSISparseFile>
<Length>87629264</Length>
<SubfileRegion> <!-- Stuff at start of file. -->
<Filename relative="1">251_head.dat</Filename>
<DestinationOffset>0</DestinationOffset>
<SourceOffset>0</SourceOffset>
<RegionLength>2768</RegionLength>
</SubfileRegion>
<SubfileRegion> <!-- RasterDMS node. -->
<Filename relative="1">251_rasterdms.dat</Filename>
<DestinationOffset>87313104</DestinationOffset>
<SourceOffset>0</SourceOffset>
<RegionLength>160</RegionLength>
</SubfileRegion>
<SubfileRegion> <!-- Stuff at end of file. -->
<Filename relative="1">251_tail.dat</Filename>
<DestinationOffset>87611924</DestinationOffset>
<SourceOffset>0</SourceOffset>
<RegionLength>17340</RegionLength>
</SubfileRegion>
<ConstantRegion> <!-- Default for the rest of the file. -->
<DestinationOffset>0</DestinationOffset>
<RegionLength>87629264</RegionLength>
<Value>0</Value>
</ConstantRegion>
</VSISparseFile>
\endverbatim
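The semantics follow from the element names: each SubfileRegion maps
RegionLength bytes of Filename, starting at SourceOffset, to DestinationOffset
in the virtual file, while a ConstantRegion fills its range with Value. In the
example above, the final ConstantRegion spans the whole file and acts as the
default for offsets not covered by the preceding regions.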
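To make the mapping concrete, the following Python sketch resolves a byte
offset of the virtual file to the region that supplies it. It is a simplified
model, not GDAL's implementation: the resolve() helper and the small control
file are hypothetical, and the rule that the first listed region containing the
offset wins is an assumption inferred from the "default for the rest of the
file" region in the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened control file in the same layout as the example above.
CONTROL_XML = """
<VSISparseFile>
  <Length>1000</Length>
  <SubfileRegion>
    <Filename relative="1">head.dat</Filename>
    <DestinationOffset>0</DestinationOffset>
    <SourceOffset>0</SourceOffset>
    <RegionLength>100</RegionLength>
  </SubfileRegion>
  <ConstantRegion>
    <DestinationOffset>0</DestinationOffset>
    <RegionLength>1000</RegionLength>
    <Value>0</Value>
  </ConstantRegion>
</VSISparseFile>
"""


def resolve(control_xml, offset):
    """Return where the byte at `offset` of the virtual file comes from."""
    root = ET.fromstring(control_xml)
    for region in root:
        if region.tag not in ("SubfileRegion", "ConstantRegion"):
            continue  # skip Length and any other non-region element
        start = int(region.findtext("DestinationOffset"))
        length = int(region.findtext("RegionLength"))
        if start <= offset < start + length:
            if region.tag == "SubfileRegion":
                source = int(region.findtext("SourceOffset")) + (offset - start)
                return ("file", region.findtext("Filename"), source)
            return ("constant", int(region.findtext("Value")))
    return None  # offset not covered by any region


print(resolve(CONTROL_XML, 50))   # served from head.dat
print(resolve(CONTROL_XML, 500))  # served by the constant region
```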
\section gdal_virtual_file_systems_vsicache File caching
This is not a proper virtual file system handler, but a C function that
takes a virtual file handle and returns a new handle that caches read-operations
on the input file handle. The cache is RAM based and the content of the cache is
discarded when the file handle is closed. The cache is a least-recently-used
list of blocks of 32 KB each.
The VSICachedFile class only handles read operations at this time, and will
error out on write operations.
This is done with the VSICreateCachedFile() function, which is implicitly used
by a number of the above mentioned file systems (namely the default one for
standard file system operations, and the /vsicurl/ and other related network
file systems) if the VSI_CACHE configuration option is set to YES.
The default cache size is 25 MB for each file that is cached, and can be
controlled with the VSI_CACHE_SIZE configuration option (value in bytes).
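The caching scheme described above, a least-recently-used list of fixed-size
blocks bounded by a per-file byte budget, can be modelled in a few lines of
Python. This is a conceptual sketch, not the VSICachedFile code: the BlockCache
class and its read_block callback are hypothetical, and taking 25 MB to mean
25000000 bytes is an assumption.

```python
from collections import OrderedDict

BLOCK_SIZE = 32 * 1024           # block size quoted in the text above
DEFAULT_CACHE_SIZE = 25_000_000  # assumed byte value of the 25 MB default


class BlockCache:
    """Read-only LRU cache of fixed-size blocks for a single file."""

    def __init__(self, read_block, cache_size=DEFAULT_CACHE_SIZE):
        self._read_block = read_block  # callable: block index -> bytes
        self._max_blocks = max(1, cache_size // BLOCK_SIZE)
        self._blocks = OrderedDict()   # oldest entry first
        self.misses = 0

    def get(self, index):
        if index in self._blocks:
            # Cache hit: mark the block as most recently used.
            self._blocks.move_to_end(index)
            return self._blocks[index]
        # Cache miss: fetch from the underlying file and remember the block.
        self.misses += 1
        data = self._read_block(index)
        self._blocks[index] = data
        if len(self._blocks) > self._max_blocks:
            self._blocks.popitem(last=False)  # evict least recently used
        return data


# Tiny cache of two blocks backed by a fake reader.
cache = BlockCache(lambda i: bytes([i]) * 4, cache_size=2 * BLOCK_SIZE)
for block_index in (0, 1, 0, 2, 0):
    cache.get(block_index)
print(cache.misses)
```

In this run block 1 is the one evicted when block 2 arrives, because block 0
was touched more recently.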
\section gdal_virtual_file_systems_vsicrypt /vsicrypt/ (encrypted files)
/vsicrypt/ is a special file handler that allows reading, creating and updating
encrypted files on the fly, with random access capabilities.
Refer to VSIInstallCryptFileHandler() for more details.
*/