---

## Get FIRMS API Key (MAP_KEY)

---

In order to provide access to FIRMS data, we require our users to sign up for a **FREE** API / map key (we call it: MAP_KEY).

The MAP_KEY was designed to conserve FIRMS resources, so everyone could have a reasonable access to our data. For example, a download script with an error could end up requesting too much data or query our database with high frequency.

FIRMS MAP_KEY was originally designed to facilitate only mapserver visualization queries, but now can be used for other requests. For example:

- Web Map Service (WMS)
- Web Feature Service (WFS)
- API

To sign up, visit https://firms.modaps.eosdis.nasa.gov/api/map_key

FIRMS MAP_KEY limits your usage to a 10-minute window. So if you exceed your limit, after 10 minutes, it resets so you can use the system again.

---

## Test Your MAP_KEY

---

In [1]:
# Let's set your map key that was emailed to you. It should look something like 'abcdef1234567890abcdef1234567890'
MAP_KEY = '5e8bad8d50fa1ca84ea72175e2bace34' #key from guy.80647@gmail.com
#MAP_KEY = 'abcdef1234567890abcdef1234567890'

# now let's check how many transactions we have
import pandas as pd
url = 'https://firms.modaps.eosdis.nasa.gov/mapserver/mapkey_status/?MAP_KEY=' + MAP_KEY
try:
  df = pd.read_json(url,  typ='series')
  display(df)
except:
  # possible error, wrong MAP_KEY value, check for extra quotes, missing letters
  print ("There is an issue with the query. \nTry in your browser: %s" % url)


transaction_limit             5000
current_transactions             0
transaction_interval    10 minutes
dtype: object

In [2]:
# let's create a simple function that tells us how many transactions we have used.
# We will use this in later examples

def get_transaction_count() :
  count = 0
  try:
    df = pd.read_json(url,  typ='series')
    count = df['current_transactions']
  except:
    print ("Error in our call.")
  return count

tcount = get_transaction_count()
print ('Our current transaction count is %i' % tcount)

Our current transaction count is 0


---

## API/data_availability

---

This service is designed to inform users about date range availability of our supported datasets.

For more information visit https://firms.modaps.eosdis.nasa.gov/api/data_availability

Let's see the full list of available sensors and their supported date ranges.


In [2]:
# let's query data_availability to find out what date range is available for various datasets
# we will explain these datasets a bit later

# this url will return information about all supported sensors and their corresponding datasets
# instead of 'all' you can specify individual sensor, ex:LANDSAT_NRT
da_url = 'https://firms.modaps.eosdis.nasa.gov/api/data_availability/csv/' + MAP_KEY + '/all'
df = pd.read_csv(da_url)
display(df)

Unnamed: 0,data_id,min_date,max_date
0,MODIS_NRT,2025-01-01,2025-04-17
1,MODIS_SP,2000-11-01,2024-12-31
2,VIIRS_NOAA20_NRT,2025-02-01,2025-04-17
3,VIIRS_NOAA20_SP,2018-04-01,2025-01-31
4,VIIRS_NOAA21_NRT,2024-01-17,2025-04-17
5,VIIRS_SNPP_NRT,2025-02-01,2025-04-17
6,VIIRS_SNPP_SP,2012-01-20,2025-01-31
7,LANDSAT_NRT,2022-06-20,2025-04-16
8,GOES_NRT,2022-08-09,2025-04-17
9,BA_MODIS,2000-11-01,2024-12-01


**data_id** column shows the dataset id which we will need in later queries:
- 'NRT' means this is Near Real-Time dataset but it may also includes Real Time (RT) and Ultra Real Time (URT) data [click here more info on URT/RT](https://www.earthdata.nasa.gov/data/tools/firms/faq)
- 'SP' or Standard Processing; standard data products are an internally consistent, well-calibrated record of the Earth’s geophysical properties to support science. There is a multi-month lag in this dataset availability. [more information on SP vs NRT](https://www.earthdata.nasa.gov/data/tools/firms/faq)
- BA_MODIS is for MODIS burned areas product

**min_date** and **max_date** columns provide the available date range for these datasets. Dates are based on GMT

In [3]:
# now let's see how many transactions we use by querying this end point

start_count = get_transaction_count()
pd.read_csv(da_url)
end_count = get_transaction_count()
print ('We used %i transactions.' % (end_count-start_count))

# now remember, after 10 minutes this will reset

NameError: name 'get_transaction_count' is not defined

---

## API/area

---

Fire detection hotspots based on area, date and sensor. For more information visit https://firms.modaps.eosdis.nasa.gov/api/area

The end point expects these parameters: [MAP_KEY], [SOURCE], [AREA_COORDINATES],[DAY_RANGE] and optionally [DATE] for historical data

**NOTE** - querying the entire world for VIIRS can return between 30,000 - 100,000+ records per day


In [4]:
# in this example let's look at VIIRS NOAA-20, entire world and the most recent day
area_url = 'https://firms.modaps.eosdis.nasa.gov/api/area/csv/' + MAP_KEY + '/VIIRS_NOAA20_NRT/world/1'
start_count = get_transaction_count()
df_area = pd.read_csv(area_url)
end_count = get_transaction_count()
print ('We used %i transactions.' % (end_count-start_count))

df_area

NameError: name 'get_transaction_count' is not defined

In [None]:
# We can also focus on a smaller area ex. South Asia and get the last 3 days of records
area_url = 'https://firms.modaps.eosdis.nasa.gov/api/area/csv/' + MAP_KEY + '/VIIRS_NOAA20_NRT/54,5.5,102,40/3'
df_area = pd.read_csv(area_url)
df_area

Unnamed: 0,latitude,longitude,bright_ti4,scan,track,acq_date,acq_time,satellite,instrument,confidence,version,bright_ti5,frp,daynight
0,14.76761,101.89860,330.46,0.60,0.71,2025-03-19,538,N20,VIIRS,n,2.0NRT,291.73,5.33,D
1,15.16837,101.53233,330.71,0.62,0.72,2025-03-19,538,N20,VIIRS,n,2.0NRT,293.02,3.93,D
2,15.17142,100.80879,333.15,0.70,0.75,2025-03-19,538,N20,VIIRS,n,2.0NRT,292.99,4.77,D
3,15.24884,101.91001,336.98,0.59,0.70,2025-03-19,538,N20,VIIRS,n,2.0NRT,291.97,5.30,D
4,15.25094,101.90823,334.18,0.59,0.70,2025-03-19,538,N20,VIIRS,n,2.0NRT,290.92,6.41,D
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
28299,23.19221,55.46107,305.49,0.66,0.73,2025-03-20,2252,N20,VIIRS,n,2.0NRT,289.56,1.89,N
28300,23.19379,55.45491,310.88,0.66,0.73,2025-03-20,2252,N20,VIIRS,n,2.0NRT,287.98,1.89,N
28301,23.19461,55.46127,308.68,0.66,0.73,2025-03-20,2252,N20,VIIRS,n,2.0NRT,289.19,2.11,N
28302,23.31620,54.17369,304.45,0.54,0.68,2025-03-20,2252,N20,VIIRS,n,2.0NRT,292.16,1.62,N


List of supported countries and their 3-letter codes. This may be easier to read in html format https://firms.modaps.eosdis.nasa.gov/api/countries/?format=html however, you won't be able to see the exent box defined for each country.

Example below shows how you can view from Python.

In [5]:
# We can also focus on smaller area ex. South Asia and get last 3 days of records
countries_url = 'https://firms.modaps.eosdis.nasa.gov/api/countries'
df_countries = pd.read_csv(countries_url, sep=';')
df_countries

Unnamed: 0,id,abreviation,name,extent
0,1,ABW,Aruba,"BOX(-70.0624080069999 12.417669989,-69.8768204..."
1,2,AFG,Afghanistan,"BOX(60.4867777910001 29.3866053260001,74.89230..."
2,3,AGO,Angola,"BOX(11.6693941430001 -18.0314047239998,24.0617..."
3,4,AIA,Anguilla,"BOX(-63.4288223949999 18.1690941430001,-62.972..."
4,6,ALA,Aland Islands,"BOX(19.5131942070001 59.9044863950001,21.09669..."
...,...,...,...,...
239,234,WSM,Samoa,"BOX(-172.782582161 -14.052829685,-171.43769283..."
240,235,YEM,Yemen,"BOX(42.5457462900001 12.1114436720001,54.54029..."
241,236,ZAF,South Africa,"BOX(16.4699813160001 -46.965752863,37.97779381..."
242,237,ZMB,Zambia,"BOX(21.9798775630001 -18.0692318719999,33.6742..."


---

## API/country

---

Provides data specific to a country, although not recommended for large countries such as USA, China, Canada, Russia due to the complexity and size of their polygon shape which may cause the query to time out. To figure out the country code, see [/api/countries](https://firms.modaps.eosdis.nasa.gov/api/countries/?format=html) (example above).

In [2]:
# Let's see last four days MODIS data for Thailand
thai_url = 'https://firms.modaps.eosdis.nasa.gov/api/country/csv/' + MAP_KEY + '/MODIS_NRT/THA/2'
df_thai = pd.read_csv(thai_url)
df_thai

Unnamed: 0,country_id,latitude,longitude,brightness,scan,track,acq_date,acq_time,satellite,instrument,confidence,version,bright_t31,frp,daynight
0,THA,8.44048,98.52376,313.44,1.56,1.23,2025-04-27,741,Aqua,MODIS,47,6.1NRT,295.14,10.53,D
1,THA,9.98310,99.06538,315.90,1.35,1.15,2025-04-27,741,Aqua,MODIS,57,6.1NRT,294.48,10.25,D
2,THA,10.13074,99.09013,315.03,1.34,1.15,2025-04-27,741,Aqua,MODIS,59,6.1NRT,296.18,8.90,D
3,THA,11.73519,99.62222,311.30,1.18,1.08,2025-04-27,741,Aqua,MODIS,62,6.1NRT,292.36,8.69,D
4,THA,14.00413,101.48232,318.39,1.00,1.00,2025-04-27,741,Aqua,MODIS,67,6.1NRT,294.74,7.89,D
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
131,THA,13.91814,101.92397,316.35,1.10,1.04,2025-04-28,238,Terra,MODIS,53,6.1NRT,295.18,6.41,D
132,THA,14.81347,104.34188,332.61,1.00,1.00,2025-04-28,238,Terra,MODIS,81,6.1NRT,303.00,15.67,D
133,THA,14.81478,104.33283,327.03,1.00,1.00,2025-04-28,238,Terra,MODIS,66,6.1NRT,303.67,8.64,D
134,THA,14.86262,104.39226,324.01,1.00,1.00,2025-04-28,238,Terra,MODIS,44,6.1NRT,301.76,6.01,D


## Change data format "acq_time"

In [5]:
import datetime

# Ensure "acq_time" is a string, then pad with zeros
df_thai["acq_time"] = df_thai["acq_time"].astype(str).str.zfill(4)

# Convert to HH:MM format
df_thai["acq_time"] = pd.to_datetime(df_thai["acq_time"], format="%H%M").dt.time
# df_thai["acq_time"] = df_thai["acq_time"].apply(lambda x: datetime.datetime.strptime(x, "%H:%M:%S").time())

# Check the result
print(df_thai.dtypes)
print(df_thai["acq_time"].head())  # Display first few converted times

country_id     object
latitude      float64
longitude     float64
brightness    float64
scan          float64
track         float64
acq_date       object
acq_time       object
satellite      object
instrument     object
confidence      int64
version        object
bright_t31    float64
frp           float64
daynight       object
dtype: object
0    02:56:00
1    02:56:00
2    02:56:00
3    02:56:00
4    02:56:00
Name: acq_time, dtype: object


In [4]:
#create coloumn acq_datetime
df_thai['acq_datetime'] = pd.to_datetime(df_thai['acq_date'] + ' ' + df_thai['acq_time'].astype(str).str.zfill(4), format='%Y-%m-%d %H%M')
df_thai['acq_datetime'].head(5)


0   2025-04-27 07:41:00
1   2025-04-27 07:41:00
2   2025-04-27 07:41:00
3   2025-04-27 07:41:00
4   2025-04-27 07:41:00
Name: acq_datetime, dtype: datetime64[ns]

In [5]:
import pytz
#change timazone to Bangkok
df_thai['acq_datetime_th'] = df_thai['acq_datetime'].dt.tz_localize('GMT').dt.tz_convert('Asia/Bangkok')
df_thai['acq_datetime_th'].head(5)


0   2025-04-27 14:41:00+07:00
1   2025-04-27 14:41:00+07:00
2   2025-04-27 14:41:00+07:00
3   2025-04-27 14:41:00+07:00
4   2025-04-27 14:41:00+07:00
Name: acq_datetime_th, dtype: datetime64[ns, Asia/Bangkok]

In [6]:
unique_values = df_thai["acq_time"].unique()
print(unique_values)

[datetime.time(2, 56) datetime.time(2, 58) datetime.time(7, 5)
 datetime.time(7, 7) datetime.time(7, 44) datetime.time(7, 46)
 datetime.time(14, 25) datetime.time(14, 27) datetime.time(19, 55)
 datetime.time(2, 36) datetime.time(8, 24) datetime.time(15, 5)
 datetime.time(15, 7) datetime.time(3, 15) datetime.time(3, 17)
 datetime.time(7, 24) datetime.time(14, 8) datetime.time(2, 17)
 datetime.time(2, 19)]


In [12]:
df_thai.head(100)

Unnamed: 0,country_id,latitude,longitude,brightness,scan,track,acq_date,acq_time,satellite,instrument,confidence,version,bright_t31,frp,daynight,acq_datetime,acq_datetime_th
0,THA,8.44048,98.52376,313.44,1.56,1.23,2025-04-27,741,Aqua,MODIS,47,6.1NRT,295.14,10.53,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00
1,THA,9.98310,99.06538,315.90,1.35,1.15,2025-04-27,741,Aqua,MODIS,57,6.1NRT,294.48,10.25,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00
2,THA,10.13074,99.09013,315.03,1.34,1.15,2025-04-27,741,Aqua,MODIS,59,6.1NRT,296.18,8.90,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00
3,THA,11.73519,99.62222,311.30,1.18,1.08,2025-04-27,741,Aqua,MODIS,62,6.1NRT,292.36,8.69,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00
4,THA,14.00413,101.48232,318.39,1.00,1.00,2025-04-27,741,Aqua,MODIS,67,6.1NRT,294.74,7.89,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,THA,19.66304,98.71753,326.00,1.07,1.03,2025-04-27,744,Aqua,MODIS,76,6.1NRT,303.16,17.08,D,2025-04-27 07:44:00,2025-04-27 14:44:00+07:00
96,THA,19.67899,98.05052,314.04,1.14,1.06,2025-04-27,744,Aqua,MODIS,32,6.1NRT,299.53,4.85,D,2025-04-27 07:44:00,2025-04-27 14:44:00+07:00
97,THA,19.72550,99.30205,321.04,1.03,1.01,2025-04-27,744,Aqua,MODIS,56,6.1NRT,302.49,9.46,D,2025-04-27 07:44:00,2025-04-27 14:44:00+07:00
98,THA,19.94037,99.37890,330.35,1.02,1.01,2025-04-27,744,Aqua,MODIS,81,6.1NRT,305.96,17.74,D,2025-04-27 07:44:00,2025-04-27 14:44:00+07:00


## Hive partition

In [8]:
import numpy as np
import pandas as pd
import pyarrow

In [13]:
# Extract components from the "acq_date" and "acq_time" columns
df_thai["acq_year"] = pd.to_datetime(df_thai["acq_datetime_th"]).dt.year
df_thai["acq_month"] = pd.to_datetime(df_thai["acq_datetime_th"]).dt.month
df_thai["acq_day"] = pd.to_datetime(df_thai["acq_datetime_th"]).dt.day
df_thai["acq_hour"] = pd.to_datetime(df_thai["acq_datetime_th"]).dt.hour
df_thai["acq_minute"] = pd.to_datetime(df_thai["acq_datetime_th"]).dt.minute


In [14]:
df_thai.head(5)

Unnamed: 0,country_id,latitude,longitude,brightness,scan,track,acq_date,acq_time,satellite,instrument,...,bright_t31,frp,daynight,acq_datetime,acq_datetime_th,acq_year,acq_month,acq_day,acq_hour,acq_minute
0,THA,8.44048,98.52376,313.44,1.56,1.23,2025-04-27,741,Aqua,MODIS,...,295.14,10.53,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00,2025,4,27,14,41
1,THA,9.9831,99.06538,315.9,1.35,1.15,2025-04-27,741,Aqua,MODIS,...,294.48,10.25,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00,2025,4,27,14,41
2,THA,10.13074,99.09013,315.03,1.34,1.15,2025-04-27,741,Aqua,MODIS,...,296.18,8.9,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00,2025,4,27,14,41
3,THA,11.73519,99.62222,311.3,1.18,1.08,2025-04-27,741,Aqua,MODIS,...,292.36,8.69,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00,2025,4,27,14,41
4,THA,14.00413,101.48232,318.39,1.0,1.0,2025-04-27,741,Aqua,MODIS,...,294.74,7.89,D,2025-04-27 07:41:00,2025-04-27 14:41:00+07:00,2025,4,27,14,41


In [None]:
# lakeFS credentials from your docker-compose.yml
ACCESS_KEY = "access_key"
SECRET_KEY = "secret_key"

# lakeFS endpoint (running locally)
lakefs_endpoint = "http://lakefs-dev:8000/"

# lakeFS repository, branch, and file path
repo = "weather"
branch = "main"
path = "firms.parquet"

# Construct the full lakeFS S3-compatible path
lakefs_s3_path = f"s3a://{repo}/{branch}/{path}"

# Configure storage_options for lakeFS (S3-compatible)
storage_options = {
    "key": ACCESS_KEY,
    "secret": SECRET_KEY,
    "client_kwargs": {
        "endpoint_url": lakefs_endpoint
    }
}

In [None]:
# Write DataFrame to a directory "output_parquet" partitioned by retrieval_time
df_thai.to_parquet(
    lakefs_s3_path,
    storage_options=storage_options,
    partition_cols=["acq_year","acq_month","acq_day","acq_hour","acq_minute"],   # <-- crucial for partitioning by retrieval_time
)

In [None]:
#read parquet
path_all_partition = 's3a://weather/main/firms.parquet'

df2=pd.read_parquet(    
    path=path_all_partition,
    storage_options=storage_options
)
df2.info()
df2.head()