In [1]:
%%bash
du -hcs qiceradar_arctic_index.gpkg

18M	qiceradar_arctic_index.gpkg
18M	total


Start by trying command-line tools to look at the dataset:

In [7]:
%%bash 
ogrinfo qiceradar_arctic_index.gpkg

INFO: Open of `qiceradar_arctic_index.gpkg'
      using driver `GPKG' successful.
1: 1993_Greenland_P3 (Line String)
2: 1995_Greenland_P3 (Line String)
3: 1996_Greenland_P3 (Line String)
4: 1997_Greenland_P3 (Line String)
5: 1998_Greenland_P3 (Line String)
6: 1999_Greenland_P3 (Line String)
7: 2001_Greenland_P3 (Line String)
8: 2002_Greenland_P3 (Line String)
9: 2003_Greenland_P3 (Line String)
10: 2005_Greenland_TO (Line String)
11: 2006_Greenland_TO (Line String)
12: 2007_Greenland_P3 (Line String)
13: 2008_Greenland_Ground (Line String)
14: 2008_Greenland_TO (Line String)
15: 2009_Greenland_TO (Line String)
16: 2010_Greenland_DC8 (Line String)
17: 2010_Greenland_P3 (Line String)
18: 2011_Greenland_P3 (Line String)
19: 2011_Greenland_TO (Line String)
20: 2012_Greenland_P3 (Line String)
21: 2013_Greenland_P3 (Line String)
22: 2014_Greenland_P3 (Line String)
23: 2015_Greenland_C130 (Line String)
24: 2016_Greenland_G1XB (Line String)
25: 2016_Greenland_P3 (Line String)
26: 2016_Greenland

OK ... that gave a table of contents, but I want to know more =)
* How many elements go into each {yyyy}_Greenland_P3 line string?
* What's the total length of all flightlines?

In [11]:
import geopandas as gpd
data = gpd.read_file("qiceradar_arctic_index.gpkg")
data

Unnamed: 0,institution,region,campaign,segment,granule,availability,uri,name,geometry
0,CRESIS,arctic,1993_Greenland_P3,19930623_01,Data_19930623_01_001,s,,CRESIS_1993_Greenland_P3_Data_19930623_01_001,"LINESTRING (-171256.100 -2465362.576, -166395...."
1,CRESIS,arctic,1993_Greenland_P3,19930623_01,Data_19930623_01_002,s,,CRESIS_1993_Greenland_P3_Data_19930623_01_002,"LINESTRING (-68987.776 -2385310.566, -63658.65..."
2,CRESIS,arctic,1993_Greenland_P3,19930623_01,Data_19930623_01_003,s,,CRESIS_1993_Greenland_P3_Data_19930623_01_003,"LINESTRING (50726.369 -2314756.736, 51196.739 ..."
3,CRESIS,arctic,1993_Greenland_P3,19930623_01,Data_19930623_01_004,s,,CRESIS_1993_Greenland_P3_Data_19930623_01_004,"LINESTRING (68815.712 -2307761.615, 69420.398 ..."
4,CRESIS,arctic,1993_Greenland_P3,19930623_01,Data_19930623_01_005,s,,CRESIS_1993_Greenland_P3_Data_19930623_01_005,"LINESTRING (87970.745 -2300323.116, 89249.635 ..."
...,...,...,...,...,...,...,...,...,...
222,CRESIS,arctic,1993_Greenland_P3,19930709_01,Data_19930709_01_030,s,,CRESIS_1993_Greenland_P3_Data_19930709_01_030,"LINESTRING (-63698.183 -2478689.898, -63643.02..."
223,CRESIS,arctic,1993_Greenland_P3,19930709_01,Data_19930709_01_031,s,,CRESIS_1993_Greenland_P3_Data_19930709_01_031,"LINESTRING (-61651.026 -2532054.133, -61569.13..."
224,CRESIS,arctic,1993_Greenland_P3,19930709_01,Data_19930709_01_032,s,,CRESIS_1993_Greenland_P3_Data_19930709_01_032,"LINESTRING (-61110.621 -2544821.872, -60876.78..."
225,CRESIS,arctic,1993_Greenland_P3,19930709_01,Data_19930709_01_033,s,,CRESIS_1993_Greenland_P3_Data_19930709_01_033,"LINESTRING (-57050.273 -2589403.358, -57730.13..."


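It looks like this only grabbed data from the first campaign, not everything in the package. That is expected: when no `layer` argument is given, `gpd.read_file` reads only the first layer of a multi-layer file.

Let's try using sqlite to inspect the database directly.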
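
As a first look, here is a minimal sketch that lists every vector layer in the file. It relies on the `gpkg_contents` table that the GeoPackage standard requires, where vector layers are registered with `data_type = 'features'`. The function name `list_feature_layers` is made up for this example.

```python
import contextlib
import sqlite3


def list_feature_layers(path):
    """Return the names of all vector layers registered in a GeoPackage."""
    # contextlib.closing makes sure the connection is closed afterwards;
    # a bare `with sqlite3.connect(...)` only manages the transaction.
    with contextlib.closing(sqlite3.connect(path)) as conn:
        rows = conn.execute(
            "SELECT table_name FROM gpkg_contents "
            "WHERE data_type = 'features' ORDER BY table_name"
        )
        return [name for (name,) in rows]
```

Each returned name can be passed to geopandas as `gpd.read_file(path, layer=name)` to load that campaign on its own.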

In [12]:
import sqlite3

In [17]:
campaign_names = set()
institutions = {}
campaigns = {}

with sqlite3.connect("qiceradar_arctic_index.gpkg") as conn:
    conn.row_factory = sqlite3.Row
    # gpkg_geometry_columns has one row per feature table (layer).
    cursor = conn.execute("SELECT table_name FROM gpkg_geometry_columns")
    for row in cursor:
        campaign_names.add(row["table_name"])  # I think this is unique per layer

    for campaign in campaign_names:
        campaigns[campaign] = 0
        # Table names start with a digit, so double-quote them as identifiers.
        cursor = conn.execute('SELECT institution FROM "{}"'.format(campaign))
        # NOTE: each row in a campaign table is one granule.
        for row in cursor:
            institutions.setdefault(row["institution"], set()).add(campaign)
            campaigns[campaign] += 1
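
The same tally can be pushed into SQL rather than looping over rows in Python. This is only a sketch: `granules_by_campaign` is a made-up name, and it assumes every feature table has an `institution` column, as the tables in this file appear to.

```python
import contextlib
import sqlite3


def granules_by_campaign(path):
    """Map each feature table to a {institution: row count} dict."""
    summary = {}
    with contextlib.closing(sqlite3.connect(path)) as conn:
        tables = [
            name
            for (name,) in conn.execute(
                "SELECT table_name FROM gpkg_geometry_columns"
            )
        ]
        for table in tables:
            # Table names come from the file itself, so they are formatted
            # into the query as double-quoted identifiers.
            query = (
                'SELECT institution, COUNT(*) FROM "{}" '
                "GROUP BY institution".format(table)
            )
            summary[table] = dict(conn.execute(query).fetchall())
    return summary
```

Calling `granules_by_campaign("qiceradar_arctic_index.gpkg")` should return one entry per campaign, with the granule count split by institution.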

In [21]:
for ii, cc in institutions.items():
    print("{} released data from {} campaigns".format(ii, len(cc)))

CRESIS released data from 30 campaigns
UTIG released data from 1 campaigns


In [27]:
ccs = list(campaigns.keys())
ccs.sort()
for cc in ccs:
    print("{} had {} granules".format(cc, campaigns[cc]))

1993_Greenland_P3 had 227 granules
1995_Greenland_P3 had 172 granules
1996_Greenland_P3 had 52 granules
1997_Greenland_P3 had 217 granules
1998_Greenland_P3 had 216 granules
1999_Greenland_P3 had 272 granules
2001_Greenland_P3 had 106 granules
2002_Greenland_P3 had 481 granules
2003_Greenland_P3 had 155 granules
2005_Greenland_TO had 114 granules
2006_Greenland_TO had 445 granules
2007_Greenland_P3 had 136 granules
2008_Greenland_Ground had 5 granules
2008_Greenland_TO had 706 granules
2009_Greenland_TO had 425 granules
2010_Greenland_DC8 had 340 granules
2010_Greenland_P3 had 585 granules
2011_Greenland_P3 had 1669 granules
2011_Greenland_TO had 284 granules
2012_Greenland_P3 had 2010 granules
2013_Greenland_P3 had 878 granules
2014_Greenland_P3 had 1864 granules
2015_Greenland_C130 had 1558 granules
2016_Greenland_G1XB had 29 granules
2016_Greenland_P3 had 552 granules
2016_Greenland_Polar6 had 62 granules
2016_Greenland_TOdtu had 90 granules
2017_Greenland_P3 had 1699 granules
2018_

Can we use the processing toolbox to get the length of all the linestrings?
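
As a cross-check that needs no GIS tooling, the total length can also be estimated straight from the geometry blobs. The sketch below decodes the GeoPackage binary header (magic `GP`, version, flags, SRS id, optional envelope) and the WKB LineString that follows it, then sums the straight-line distance between consecutive vertices. It assumes every layer holds plain 2D LineStrings and that the coordinates are projected metres, which the values printed earlier suggest but which I have not verified. The names `linestring_length` and `total_length_km` are made up for this example.

```python
import contextlib
import math
import sqlite3
import struct

# Size in bytes of the optional envelope, keyed by the header's envelope code.
ENVELOPE_BYTES = {0: 0, 1: 32, 2: 48, 3: 48, 4: 64}


def linestring_length(blob):
    """Planar length of a 2D LineString stored as a GeoPackage geometry blob."""
    if blob[:2] != b"GP":
        raise ValueError("not a GeoPackage geometry blob")
    envelope_code = (blob[3] >> 1) & 0x07
    # The header is 8 bytes (magic, version, flags, srs_id) plus the envelope.
    wkb = blob[8 + ENVELOPE_BYTES[envelope_code]:]
    order = "<" if wkb[0] == 1 else ">"
    geom_type, n_points = struct.unpack_from(order + "II", wkb, 1)
    if geom_type != 2:
        raise ValueError("this sketch only handles 2D LineString (WKB type 2)")
    coords = struct.unpack_from("{}{}d".format(order, 2 * n_points), wkb, 9)
    points = list(zip(coords[0::2], coords[1::2]))
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def total_length_km(path):
    """Sum the lengths of every feature in every layer, in kilometres."""
    total_m = 0.0
    with contextlib.closing(sqlite3.connect(path)) as conn:
        layers = conn.execute(
            "SELECT table_name, column_name FROM gpkg_geometry_columns"
        ).fetchall()
        for table, column in layers:
            query = 'SELECT "{}" FROM "{}"'.format(column, table)
            for (blob,) in conn.execute(query):
                if blob is not None:  # skip features without geometry
                    total_m += linestring_length(blob)
    return total_m / 1000.0
```

These are planar lengths in the layer's own coordinate system, so they carry whatever scale distortion the projection has. With geopandas, `gpd.read_file(path, layer=name).length.sum()` should give the comparable per-layer figure.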