# Lists of Species by Country 

This code uses the effechecka API to get a list of taxa that have been reported in each country. The API takes a polygon (points are lat/lon coordinates) and returns observations within that polygon from several species occurrence databases. To use this notebook, you need a list of geonames IDs and a JSON file with geonames polygons.

In the cell below, we import the necessary libraries and open the data files. The input file, test_country.txt, lists the countries to query, including the geonames ID for each country. There are only two countries in the file at any given time, which helps avoid overloading the server. The code is written so that more countries can be included if the server can handle more queries at once in the future. The file low_res_countries.json contains the polygons from geonames, reduced in resolution so they fit in the URL of the API call.

In [75]:
import urllib.request
import urllib.error
import json

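#read the input lines up front so the file handle is closed right away
with open('test_country.txt', 'r') as f:
    in_file = f.readlines()

#load the country polygons; the with block closes the file after parsing
with open('low_res_countries.json', 'r') as shape_file:
    shapes = json.load(shape_file)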
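
As a rough sketch of the two inputs, the snippet below parses one tab-separated row and looks up its polygon. The column order (country, ISO code, geonames ID) and the JSON keys are inferred from how the next cell indexes them. The sample row, the sample coordinates, and the helper `find_feature` are hypothetical and only illustrate the layout.

```python
import json

# Hypothetical row in the layout the main loop expects:
# country name, ISO code, geonames ID, separated by tabs.
sample_line = 'Iraq\tIQ\t99237\n'
country, iso, geonamesid = sample_line.strip().split('\t')

# Hypothetical, heavily simplified GeoJSON with only the keys the loop reads.
sample_shapes = json.loads('''
{"features": [
  {"properties": {"geoNameId": "99237"},
   "geometry": {"type": "Polygon",
                "coordinates": [[[48.567, 29.916], [48.21, 30.033],
                                 [47.95, 30.061], [48.567, 29.916]]]}}
]}
''')

def find_feature(shapes, geonames_id):
    """Return the first feature whose geoNameId matches, or None."""
    for feature in shapes['features']:
        # Compare as strings so an integer ID in the JSON would still match.
        if str(feature['properties']['geoNameId']) == str(geonames_id):
            return feature
    return None

feature = find_feature(sample_shapes, geonamesid)
print(country, iso, feature['geometry']['type'])
```

Comparing the IDs as strings is a defensive choice in this sketch; the notebook itself compares the raw values directly.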

The cell below takes each country's polygon and builds the URL used to query the API. Each query takes two requests: the first starts effechecka working on the query, and the second (made a day later) retrieves the results. If the query has been submitted before, the second request is not needed. The results returned by the API are written to the files listed in out_files, with a separate file for each country.
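
Before the full cell, here is a compact sketch of the same URL-building idea. It is not the notebook's own method: it swaps the hand-written `%20` and `%2C` joins for `urllib.parse.quote`, and the helper names `ring_to_wkt`, `geometry_to_wkt`, and `build_url` are hypothetical. It assumes, as the cell does, that only the outer ring of each polygon is sent and that empty rings are skipped.

```python
from urllib.parse import quote

BASE_URL = 'http://api.effechecka.org/checklist.tsv?traitSelector=&wktString='

def ring_to_wkt(ring):
    """Format one GeoJSON ring ([[lon, lat], ...]) as 'lon lat, lon lat, ...'."""
    return ', '.join(f'{point[0]} {point[1]}' for point in ring)

def geometry_to_wkt(geometry):
    """Convert a GeoJSON Polygon or MultiPolygon to WKT (outer rings only)."""
    if geometry['type'] == 'Polygon':
        return 'POLYGON((' + ring_to_wkt(geometry['coordinates'][0]) + '))'
    if geometry['type'] == 'MultiPolygon':
        # Skip polygons whose outer ring was emptied by the resolution reduction.
        rings = [poly[0] for poly in geometry['coordinates'] if poly and poly[0]]
        parts = ['POLYGON ((' + ring_to_wkt(ring) + '))' for ring in rings]
        return 'GEOMETRYCOLLECTION(' + ','.join(parts) + ')'
    raise ValueError('unsupported geometry type: ' + geometry['type'])

def build_url(geometry):
    """Percent-encode the whole WKT string and append it to the query URL."""
    return BASE_URL + quote(geometry_to_wkt(geometry), safe='')

# Illustrative geometry only; real polygons come from low_res_countries.json.
example = {'type': 'Polygon',
           'coordinates': [[[48.567, 29.916], [48.21, 30.033], [48.567, 29.916]]]}
print(build_url(example))
```

Encoding the finished WKT string in one step avoids having to trim a trailing separator after the loop, which is the fiddliest part of the manual approach.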

In [76]:
out_files = ['output1.tsv','output2.tsv','output3.tsv']
#everything up to the urlretrieve call reads the input json and forms the URL for the API query
for index, line in enumerate(in_file):
    line = line.strip()
    row = line.split('\t')
    geonamesid = row[2]
    iso = row[1]
    print(geonamesid) #print the id so you know what country you are on
    country = row[0]
    polygons = shapes['features']
    for polygon in polygons:
        geoid = polygon['properties']['geoNameId']
        if geonamesid == geoid: #use the geonames id to find the right polygon in the shapes file
            shape_type = polygon['geometry']['type']
            if shape_type == 'Polygon': #some country polygons are multiple polygons. Need a different procedure
                p = []
                wkt = polygon['geometry']['coordinates'][0]
                for i in wkt:
                    z = []
                    lat = i[1]
                    lon = i[0]
                    z.append(str(lon))
                    z.append(str(lat))
                    m = '%20'.join(z)
                    p.append(str(m))
                q = '%2C%20'.join(p)
                url = 'http://api.effechecka.org/checklist.tsv?traitSelector=&wktString=POLYGON((' + str(q) + '))'
                z = 'POLYGON((' + str(q) + '))'
            elif shape_type == 'MultiPolygon':
                q = ''
                url = 'http://api.effechecka.org/checklist.tsv?traitSelector=&wktString=GEOMETRYCOLLECTION%28POLYGON%20%28%28'
                wkt = polygon['geometry']['coordinates']
                for k in wkt:
                    k = k[0]
                    if len(k) == 0: #the process of shortening the polygons left a lot of blank coordinates. They get removed here.
                        continue
                    p = []
                    for i in k:
                        z = []
                        for j in i:
                            z.append(str(j))
                        m = '%20'.join(z)
                        p.append(str(m))
                    q = q + '%2C%20'.join(p) + '%29%29%2CPOLYGON%20%28%28'
                tail = '%2CPOLYGON%20%28%28'
                if q.endswith(tail): #str.strip() removes a set of characters, not a suffix, so slice the trailing separator off instead
                    q = q[:-len(tail)]
                url = url + q + '%29'
                z = 'GEOMETRYCOLLECTION%28POLYGON%20%28%28' + q + '%29'
            print(url)
            try:
                urllib.request.urlretrieve(url, out_files[index]) #This is where the url is submitted to the API and results are read
            except urllib.error.URLError as e:
                print(e.reason)
            with open(out_files[index], 'a') as u: #the with block closes the file, so no explicit close() is needed
                u.write('\ncountry\t' + country + '\n')
                u.write('country_uri\t' + geonamesid + '\n')
                u.write('polygon\t' + z + '\n')
print('complete') #make sure the code gets to the end
            

99237
http://api.effechecka.org/checklist.tsv?traitSelector=&wktString=GEOMETRYCOLLECTION%28POLYGON%20%28%2848.567%2029.916%2C%2048.21%2030.033%2C%2047.95%2030.061%2C%2047.709%2030.096%2C%2047.181%2030.026%2C%2046.556%2029.103%2C%2044.72%2029.206%2C%2043.611%2030.022%2C%2041.444%2031.378%2C%2039.203%2032.158%2C%2039.26%2032.356%2C%2038.986%2032.478%2C%2038.796%2033.368%2C%2041.238%2034.785%2C%2041.283%2035.486%2C%2041.381%2035.835%2C%2041.295%2036.356%2C%2041.828%2036.593%2C%2042.364%2037.109%2C%2042.799%2037.377%2C%2043.167%2037.374%2C%2043.8%2037.23%2C%2044.03%2037.325%2C%2044.279%2037.236%2C%2044.202%2037.098%2C%2044.351%2037.049%2C%2044.771%2037.167%2C%2044.921%2037.02%2C%2044.861%2036.784%2C%2045.072%2036.691%2C%2045.108%2036.419%2C%2045.284%2036.383%2C%2045.279%2036.253%2C%2045.387%2036.085%2C%2045.557%2036.001%2C%2046.093%2035.861%2C%2046.349%2035.809%2C%2046.013%2035.678%2C%2046.156%2035.287%2C%2046.203%2035.198%2C%2046.189%2035.108%2C%2046.064%2035.036%2C%2045.883%2035.031%2C%