# Part 4: Using GeoPandas to Read and Write Vector Data

<img width="20%" src="https://geopandas.readthedocs.io/en/latest/_static/geopandas_logo_web.svg"></img>

GeoPandas is a popular Python library that simplifies working with geospatial data by extending the pandas library. It provides the data structures and functions needed to manipulate and analyze geometries such as points, lines, and polygons, and to perform spatial operations such as spatial joins, overlays, and projections.

GeoPandas builds upon several core Python libraries, including pandas, Shapely, Fiona, and pyproj. These dependencies provide the underlying functionality for handling geospatial data structures, file I/O, and coordinate transformations.

http://geopandas.readthedocs.io

We can import geopandas now. Most developers import it as "gpd" to type less.

In [1]:
import geopandas as gpd

## Reading the Natural Earth Dataset

In a previous part we downloaded the Natural Earth vector dataset and looked at the airports and countries datasets.

Let's do that again, this time using GeoPandas.

In [2]:
filename = "geodata/packages/natural_earth_vector.gpkg"

airports = gpd.read_file(filename, layer="ne_10m_airports")

The variable `airports` is a GeoDataFrame. Its last column, `geometry`, holds the geometry of each feature. Compared to Fiona this is really easy!

In JupyterLab we can display it simply by evaluating the variable:

In [3]:
airports

Unnamed: 0,scalerank,featurecla,type,name,abbrev,location,gps_code,iata_code,wikipedia,natlscale,...,name_vi,name_zh,wdid_score,ne_id,name_fa,name_he,name_uk,name_ur,name_zht,geometry
0,9,Airport,small,Sahnewal,LUH,terminal,VILD,LUH,http://en.wikipedia.org/wiki/Sahnewal_Airport,8.0,...,,,4.0,1159113785,فرودگاه سهنول,,,,,POINT (75.95707 30.85036)
1,9,Airport,mid,Solapur,SSE,terminal,VASL,SSE,http://en.wikipedia.org/wiki/Solapur_Airport,8.0,...,,,4.0,1159113803,فرودگاه سولاپور,,,,,POINT (75.93306 17.62542)
2,9,Airport,mid,Birsa Munda,IXR,terminal,VERC,IXR,http://en.wikipedia.org/wiki/Birsa_Munda_Airport,8.0,...,Sân bay Birsa Munda,比尔萨·蒙达机场,4.0,1159113831,فرودگاه بیرسا موندا,,,,比爾薩·蒙達機場,POINT (85.32360 23.31772)
3,9,Airport,mid,Ahwaz,AWZ,terminal,OIAW,AWZ,http://en.wikipedia.org/wiki/Ahwaz_Airport,8.0,...,Sân bay Ahvaz,阿瓦士机场,4.0,1159113845,فرودگاه بین المللی اهواز,,,,阿瓦士機場,POINT (48.74711 31.34316)
4,9,Airport,mid and military,Gwalior,GWL,terminal,VIGR,GWL,http://en.wikipedia.org/wiki/Gwalior_Airport,8.0,...,Sân bay Gwalior,辛迪亚航空站,4.0,1159113863,فرودگاه گوالیور,,,,辛迪亞航空站,POINT (78.21722 26.28549)
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
888,2,Airport,major,Arlanda,ARN,terminal,ESSA,ARN,http://en.wikipedia.org/wiki/Stockholm-Arlanda...,150.0,...,Sân bay Stockholm-Arlanda,斯德哥尔摩－阿兰达机场,4.0,1159127877,فرودگاه استکهلم-آرلاندا,נמל התעופה סטוקהולם ארלנדה,Стокгольм-Арланда,اسٹاک ہوم ارلینڈا ہوائی اڈا,斯德哥爾摩－阿蘭達機場,POINT (17.93073 59.65112)
889,2,Airport,major,Soekarno-Hatta Int'l,CGK,parking,WIII,CGK,http://en.wikipedia.org/wiki/Soekarno-Hatta_In...,150.0,...,Sân bay quốc tế Soekarno-Hatta,苏加诺－哈达国际机场,4.0,1159127891,فرودگاه بینالمللی سوئکارنو-هتا,נמל התעופה סואקרנו-האטה,Сукарно-Хатта,سوکارنو-ہاتا بین الاقوامی ہوائی اڈا,蘇加諾－哈達國際機場,POINT (106.65430 -6.12660)
890,2,Airport,major,Eleftherios Venizelos Int'l,ATH,terminal,LGAV,ATH,http://en.wikipedia.org/wiki/Athens_Internatio...,150.0,...,Sân bay quốc tế Athena,雅典埃莱夫塞里奥斯·韦尼泽洛斯国际机场,2.0,1159127903,فرودگاه بینالمللی آتن,נמל התעופה הבינלאומי אתונה-אלפתריוס וניזלוס,Міжнародний аеропорт «Елефтеріос Венізелос»,,雅典埃萊夫塞里奧斯·韋尼澤洛斯國際機場,POINT (23.94712 37.93623)
891,2,Airport,major,Tokyo Int'l,HND,terminal,RJTT,HND,https://en.wikipedia.org/wiki/Haneda_Airport,150.0,...,Sân bay quốc tế Tokyo,東京國際機場,,1729942773,فرودگاه هانهدا,נמל התעופה טוקיו האנדה,Міжнародний аеропорт Токіо,ہانیدا ہوائی اڈا,東京國際機場,POINT (139.78405 35.54906)
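To see what a GeoDataFrame is made of without needing the Natural Earth file, here is a minimal sketch that builds one by hand. The three rows reuse names and coordinates from the output above; the variable name `sample` and the choice of `EPSG:4326` as the CRS are assumptions made for this illustration.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical three-row sample; names and coordinates copied from the table above.
sample = gpd.GeoDataFrame(
    {
        "name": ["Arlanda", "Soekarno-Hatta Int'l", "Tokyo Int'l"],
        "iata_code": ["ARN", "CGK", "HND"],
    },
    geometry=[
        Point(17.93073, 59.65112),
        Point(106.65430, -6.12660),
        Point(139.78405, 35.54906),
    ],
    crs="EPSG:4326",  # assumption: longitude/latitude in WGS 84
)

print(type(sample).__name__)              # class of the object
print(sample.geometry.name)               # name of the active geometry column
print(sample.geom_type.unique().tolist()) # geometry types present
print(sample.geometry.x.tolist())         # longitudes of the points
```

Apart from the `geometry` column, this behaves like an ordinary pandas DataFrame.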


There are 41 columns and 893 rows. The GeoDataFrame's `shape` attribute reports the same information as a `(rows, columns)` tuple:

In [4]:
airports.shape

(893, 41)

We can create a new GeoDataFrame with fewer columns by listing the columns we want. We should always keep the **geometry** column.

In [5]:
airports2 = airports[['scalerank', 'type', 'name', 'iata_code', 'geometry']]

In [6]:
airports2

| | scalerank | type | name | iata_code | geometry |
|---|---|---|---|---|---|
| 0 | 9 | small | Sahnewal | LUH | POINT (75.95707 30.85036) |
| 1 | 9 | mid | Solapur | SSE | POINT (75.93306 17.62542) |
| 2 | 9 | mid | Birsa Munda | IXR | POINT (85.32360 23.31772) |
| 3 | 9 | mid | Ahwaz | AWZ | POINT (48.74711 31.34316) |
| 4 | 9 | mid and military | Gwalior | GWL | POINT (78.21722 26.28549) |
| ... | ... | ... | ... | ... | ... |
| 888 | 2 | major | Arlanda | ARN | POINT (17.93073 59.65112) |
| 889 | 2 | major | Soekarno-Hatta Int'l | CGK | POINT (106.65430 -6.12660) |
| 890 | 2 | major | Eleftherios Venizelos Int'l | ATH | POINT (23.94712 37.93623) |
| 891 | 2 | major | Tokyo Int'l | HND | POINT (139.78405 35.54906) |
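Why keep the geometry column? A small sketch on hand-made data (the `sample` frame below is hypothetical, with two rows borrowed from the airports output) shows what the selection returns with and without it:

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical two-row sample, used only to illustrate column selection.
sample = gpd.GeoDataFrame(
    {
        "scalerank": [2, 2],
        "name": ["Arlanda", "Tokyo Int'l"],
        "iata_code": ["ARN", "HND"],
    },
    geometry=[Point(17.93073, 59.65112), Point(139.78405, 35.54906)],
)

with_geometry = sample[["name", "geometry"]]     # geometry column kept
attributes_only = sample[["name", "iata_code"]]  # geometry column dropped

print(type(with_geometry).__name__)    # GeoDataFrame
print(type(attributes_only).__name__)  # DataFrame
```

Without the geometry column the result is a plain pandas DataFrame, so spatial methods and spatial file output are no longer available on it.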


We can also call `.head(n)` to display the first n entries:

In [7]:
airports2.head(5)

| | scalerank | type | name | iata_code | geometry |
|---|---|---|---|---|---|
| 0 | 9 | small | Sahnewal | LUH | POINT (75.95707 30.85036) |
| 1 | 9 | mid | Solapur | SSE | POINT (75.93306 17.62542) |
| 2 | 9 | mid | Birsa Munda | IXR | POINT (85.32360 23.31772) |
| 3 | 9 | mid | Ahwaz | AWZ | POINT (48.74711 31.34316) |
| 4 | 9 | mid and military | Gwalior | GWL | POINT (78.21722 26.28549) |


### Sorting

Sorting is very easy: pass the column to sort on as `by`. `sort_values` returns a new, sorted GeoDataFrame and leaves the original unchanged.

In [8]:
airports2.sort_values(by="name", ascending=True)

| | scalerank | type | name | iata_code | geometry |
|---|---|---|---|---|---|
| 448 | 6 | mid | Aba Tenna D. Yilma Int'l | DIR | POINT (41.85776 9.61268) |
| 21 | 9 | mid and military | Abdul Rachman Saleh | MLG | POINT (112.71142 -7.92998) |
| 626 | 4 | mid | Abidjan Port Bouet | ABJ | POINT (-3.93222 5.25440) |
| 554 | 6 | major | Abu Dhabi Int'l | AUH | POINT (54.64633 24.42723) |
| 565 | 5 | major | Abuja Int'l | ABV | POINT (7.27026 9.00438) |
| ... | ... | ... | ... | ... | ... |
| 308 | 7 | major | Zvartnots Int'l | EVN | POINT (44.40006 40.15237) |
| 381 | 7 | major | Ürümqi Diwopu Int'l | URC | POINT (87.46713 43.89834) |
| 231 | 8 | mid | Łódź Władysław Reymont | LCJ | POINT (19.40321 51.72721) |
| 49 | 8 | major | Şakirpaşa | ADA | POINT (35.29696 36.98521) |
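Note that names starting with non-ASCII letters (Ürümqi, Łódź, Şakirpaşa) end up after "Z", because strings are compared by code point. The sketch below reproduces this on three hand-picked rows and shows one possible workaround using the `key` argument of `sort_values`. The helper `ascii_fold` is a hypothetical, deliberately crude function written for this example, not part of GeoPandas; letters without an ASCII decomposition, such as "Ł", are simply dropped by it.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical three-row sample taken from the sorted output above.
sample = gpd.GeoDataFrame(
    {"name": ["Zvartnots Int'l", "Ürümqi Diwopu Int'l", "Abuja Int'l"]},
    geometry=[
        Point(44.40006, 40.15237),
        Point(87.46713, 43.89834),
        Point(7.27026, 9.00438),
    ],
)

# Default behaviour: plain code-point order, so "Ü" sorts after "Z".
default_order = sample.sort_values(by="name")


def ascii_fold(names):
    """Strip accents and lowercase a string Series (illustrative helper)."""
    return (
        names.str.normalize("NFKD")       # split letters from their accents
        .str.encode("ascii", "ignore")    # drop everything that is not ASCII
        .str.decode("ascii")
        .str.lower()
    )


# The key function is applied to the column before the values are compared.
folded_order = sample.sort_values(by="name", key=ascii_fold)

print(default_order["name"].tolist())
# ["Abuja Int'l", "Zvartnots Int'l", "Ürümqi Diwopu Int'l"]
print(folded_order["name"].tolist())
# ["Abuja Int'l", "Ürümqi Diwopu Int'l", "Zvartnots Int'l"]
```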


### Queries

The `query()` method filters rows using a boolean expression given as a string:

In [9]:
airports2.query("scalerank == 2")

| | scalerank | type | name | iata_code | geometry |
|---|---|---|---|---|---|
| 828 | 2 | major | Hong Kong Int'l | HKG | POINT (113.93502 22.31533) |
| 829 | 2 | major | Taoyuan | TPE | POINT (121.23137 25.07674) |
| 830 | 2 | major | Schiphol | AMS | POINT (4.76438 52.30893) |
| 831 | 2 | major | Singapore Changi | SIN | POINT (103.98641 1.35616) |
| 832 | 2 | major | London Heathrow | LHR | POINT (-0.45316 51.47100) |
| ... | ... | ... | ... | ... | ... |
| 888 | 2 | major | Arlanda | ARN | POINT (17.93073 59.65112) |
| 889 | 2 | major | Soekarno-Hatta Int'l | CGK | POINT (106.65430 -6.12660) |
| 890 | 2 | major | Eleftherios Venizelos Int'l | ATH | POINT (23.94712 37.93623) |
| 891 | 2 | major | Tokyo Int'l | HND | POINT (139.78405 35.54906) |


In [10]:
airports2.query("iata_code == 'ZRH'")

| | scalerank | type | name | iata_code | geometry |
|---|---|---|---|---|---|
| 823 | 3 | major | Zurich Int'l | ZRH | POINT (8.56221 47.45239) |
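Conditions can be combined with `and` / `or`, and Python variables can be referenced with an `@` prefix. The sketch below runs on a small hand-made frame (the four rows and the variable `max_rank` are chosen for this illustration) and also shows the equivalent boolean-mask form:

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical four-row sample built from rows shown in the outputs above.
sample = gpd.GeoDataFrame(
    {
        "scalerank": [2, 2, 3, 9],
        "name": ["Arlanda", "Tokyo Int'l", "Zurich Int'l", "Sahnewal"],
        "iata_code": ["ARN", "HND", "ZRH", "LUH"],
    },
    geometry=[
        Point(17.93073, 59.65112),
        Point(139.78405, 35.54906),
        Point(8.56221, 47.45239),
        Point(75.95707, 30.85036),
    ],
)

max_rank = 3  # illustrative threshold, referenced inside the query via @max_rank

# Two conditions joined with "and"; @max_rank is looked up in Python scope.
selected = sample.query("scalerank <= @max_rank and iata_code != 'HND'")

# The same filter written as a boolean mask.
same_rows = sample[(sample["scalerank"] <= max_rank) & (sample["iata_code"] != "HND")]

print(selected["iata_code"].tolist())   # ['ARN', 'ZRH']
print(same_rows["iata_code"].tolist())  # ['ARN', 'ZRH']
```

Both forms return a GeoDataFrame; `query()` is often easier to read, while masks are handy when the condition is built programmatically.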
