In [1]:
import sys
sys.version_info

sys.version_info(major=3, minor=12, micro=5, releaselevel='final', serial=0)

Some new metadata columns have been added to help identify which sample sets have usage restrictions (e.g., publication embargo) and which are available for unrestricted use. The metadata columns are:

* `terms_of_use_expiry_date` - Gives the date on which any terms of use will expire. After this date, there will be no usage restrictions on data relating to the sample set.
* `terms_of_use_url` - Gives the address of a web page that describes any usage restrictions which apply to the sample set.

If either of these fields is empty, no terms of use apply to the sample set.

These new metadata columns can be accessed via the `malariagen_data` Python API. The API also adds a computed field:

* `unrestricted_use` - This is a computed column which is added for convenience. The value is `True` if the terms of use have expired, or if there were never any usage restrictions applied.
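The rule behind `unrestricted_use` can be illustrated with a minimal sketch. This is not the `malariagen_data` implementation: the function name `is_unrestricted` is hypothetical, the rule is assumed to depend only on `terms_of_use_expiry_date`, and "after this date" is assumed to mean strictly later than the expiry date.

```python
from datetime import date


def is_unrestricted(expiry_date, today):
    """Hypothetical helper: True if no expiry date is set or the date has passed."""
    # Treat None, empty strings and NaN (which is not equal to itself) as "no date".
    if expiry_date is None or expiry_date == "" or expiry_date != expiry_date:
        return True
    # Assumption: restrictions lift only once the expiry date is in the past.
    return date.fromisoformat(str(expiry_date)) < today


today = date(2025, 1, 15)  # fixed date so the example is reproducible
flags = [is_unrestricted(d, today) for d in ["2024-07-22", "2026-04-09", None, ""]]
print(flags)
```

In practice there is no need to compute this yourself, because the API already provides the `unrestricted_use` column.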

The metadata columns are available in the dataframes returned by the `sample_sets()` and `sample_metadata()` functions. Below are some examples for data from the *Anopheles gambiae* complex accessed via the `Ag3` API. Similar code can be used for *Anopheles funestus* samples via the `Af1` API.

In [2]:
import malariagen_data
ag3 = malariagen_data.Ag3()

In [3]:
malariagen_data.__version__  # diagnostics

'13.0.0'

In [4]:
ag3  # diagnostics

MalariaGEN Ag3 API client
Please note that data are subject to terms of use, for more information see the MalariaGEN website or contact support@malariagen.net. See also the Ag3 API docs.
Storage URL,gs://vo_agam_release_master_us_central1/
Data releases available,"3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10"
Results cache,
Cohorts analysis,20240717
AIM analysis,20220528
Site filters analysis,dt_20200416
Software version,malariagen_data 13.0.0
Client location,"England, United Kingdom"


In [5]:
ag3._show_progress = False  # for the blog post, avoid progress outputs

Sample set metadata:

In [6]:
df_sample_sets = ag3.sample_sets()
df_sample_sets[["sample_set", "terms_of_use_expiry_date", "unrestricted_use"]]

,sample_set,terms_of_use_expiry_date,unrestricted_use
0,AG1000G-AO,2025-01-01,False
1,AG1000G-BF-A,2025-01-01,False
2,AG1000G-BF-B,2025-01-01,False
3,AG1000G-BF-C,2025-01-01,False
4,AG1000G-CD,2025-01-01,False
...,...,...,...
78,1323-VO-GM-NGWA-VMF00235,2026-04-09,False
79,1323-VO-GM-NGWA-VMF00242,2026-04-09,False
80,1329-VO-GA-CHRISTOPHE-VMF00228,2026-04-09,False
81,bergey-2019,,True


Query to find sample sets with no usage restrictions:

In [7]:
df_sample_sets.query("unrestricted_use")

,sample_set,sample_count,study_id,study_url,terms_of_use_expiry_date,terms_of_use_url,release,unrestricted_use
32,fontaine-2015-rebuild,72,fontaine-2015-rebuild,https://doi.org/10.1126/science.1258524,,https://www.science.org/doi/10.1126/science.12...,3.1,True
34,1237-VO-BJ-DJOGBENOU-VMF00050,90,1237-VO-BJ-DJOGBENOU,https://www.malariagen.net/partner_study/1237-...,2024-07-22,https://malariagen.github.io/vector-data/ag3/a...,3.2,True
35,1237-VO-BJ-DJOGBENOU-VMF00067,142,1237-VO-BJ-DJOGBENOU,https://www.malariagen.net/partner_study/1237-...,2024-07-22,https://malariagen.github.io/vector-data/ag3/a...,3.2,True
36,1244-VO-GH-YAWSON-VMF00051,666,1244-VO-GH-YAWSON,https://www.malariagen.net/partner_study/1244-...,2024-07-22,https://malariagen.github.io/vector-data/ag3/a...,3.2,True
37,1245-VO-CI-CONSTANT-VMF00054,38,1245-VO-CI-CONSTANT,https://www.malariagen.net/partner_study/1245-...,2024-07-22,https://malariagen.github.io/vector-data/ag3/a...,3.2,True
38,1253-VO-TG-DJOGBENOU-VMF00052,179,1253-VO-TG-DJOGBENOU,https://www.malariagen.net/partner_study/1253-...,2024-07-22,https://malariagen.github.io/vector-data/ag3/a...,3.2,True
39,1178-VO-UG-LAWNICZAK-VMF00025,57,1178-VO-UG-LAWNICZAK,https://www.malariagen.net/partner_study/1178-...,2023-10-26,https://malariagen.github.io/vector-data/ag3/a...,3.3,True
65,barron-2019,4,barron-2019,https://doi.org/10.1038/s41598-019-49065-5,,https://www.nature.com/articles/s41598-019-490...,3.7,True
66,crawford-2016,25,crawford-2016,https://doi.org/10.1111/mec.13572,,https://onlinelibrary.wiley.com/doi/10.1111/me...,3.7,True
72,tennessen-2021,208,tennessen-2021,https://doi.org/10.1111/mec.15756,,https://onlinelibrary.wiley.com/doi/10.1111/me...,3.8,True
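Once the unrestricted sample sets are known, it can be handy to collect their identifiers as a plain list. The sketch below uses a small toy dataframe in place of the output of `ag3.sample_sets()`; the helper name `unrestricted_sample_set_ids` is hypothetical and not part of the API.

```python
import pandas as pd


def unrestricted_sample_set_ids(df_sample_sets):
    """Hypothetical helper: identifiers of sample sets flagged as unrestricted."""
    mask = df_sample_sets["unrestricted_use"].astype(bool)
    return df_sample_sets.loc[mask, "sample_set"].tolist()


# Toy stand-in for the dataframe returned by ag3.sample_sets().
toy_sample_sets = pd.DataFrame(
    {
        "sample_set": ["AG1000G-AO", "barron-2019", "crawford-2016"],
        "unrestricted_use": [False, True, True],
    }
)
ids = unrestricted_sample_set_ids(toy_sample_sets)
print(ids)
```

With a live client, a list like this could then be passed to functions that accept a `sample_sets` parameter, for example `ag3.sample_metadata(sample_sets=ids)`.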


Sample metadata:

In [8]:
df_samples = ag3.sample_metadata()
df_samples[["sample_id", "sample_set", "terms_of_use_expiry_date", "terms_of_use_url", "unrestricted_use"]]

,sample_id,sample_set,terms_of_use_expiry_date,terms_of_use_url,unrestricted_use
0,VBS00256-4651STDY7017184,1177-VO-ML-LEHMANN-VMF00004,2025-11-17,https://malariagen.github.io/vector-data/ag3/a...,False
1,VBS00257-4651STDY7017185,1177-VO-ML-LEHMANN-VMF00004,2025-11-17,https://malariagen.github.io/vector-data/ag3/a...,False
2,VBS00259-4651STDY7017186,1177-VO-ML-LEHMANN-VMF00004,2025-11-17,https://malariagen.github.io/vector-data/ag3/a...,False
3,VBS00262-4651STDY7017187,1177-VO-ML-LEHMANN-VMF00004,2025-11-17,https://malariagen.github.io/vector-data/ag3/a...,False
4,VBS00277-4651STDY7017189,1177-VO-ML-LEHMANN-VMF00004,2025-11-17,https://malariagen.github.io/vector-data/ag3/a...,False
...,...,...,...,...,...
19766,SAMN15222632,tennessen-2021,,https://onlinelibrary.wiley.com/doi/10.1111/me...,True
19767,SAMN15222633,tennessen-2021,,https://onlinelibrary.wiley.com/doi/10.1111/me...,True
19768,SAMN15222634,tennessen-2021,,https://onlinelibrary.wiley.com/doi/10.1111/me...,True
19769,SAMN15222635,tennessen-2021,,https://onlinelibrary.wiley.com/doi/10.1111/me...,True


Query to find samples with no usage restrictions:

In [9]:
df_samples.query("unrestricted_use")

,sample_id,partner_sample_id,contributor,country,location,year,month,latitude,longitude,sex_call,...,admin1_name,admin1_iso,admin2_name,taxon,cohort_admin1_year,cohort_admin1_month,cohort_admin1_quarter,cohort_admin2_year,cohort_admin2_month,cohort_admin2_quarter
670,VBS10116-4954STDY7089644,UG4A2016A1_96,Mara Lawniczak,Uganda,Busia,2013,1,0.466,34.089,F,...,Eastern Region,UG-E,Busia,gambiae,UG-E_gamb_2013,UG-E_gamb_2013_01,UG-E_gamb_2013_Q1,UG-E_Busia_gamb_2013,UG-E_Busia_gamb_2013_01,UG-E_Busia_gamb_2013_Q1
671,VBS10117-4954STDY7089645,UG4A2016B1_95,Mara Lawniczak,Uganda,Busia,2016,6,0.466,34.089,F,...,Eastern Region,UG-E,Busia,gambiae,UG-E_gamb_2016,UG-E_gamb_2016_06,UG-E_gamb_2016_Q2,UG-E_Busia_gamb_2016,UG-E_Busia_gamb_2016_06,UG-E_Busia_gamb_2016_Q2
672,VBS10118-4954STDY7089646,UG4A2016C1_94,Mara Lawniczak,Uganda,Busia,2016,6,0.466,34.089,F,...,Eastern Region,UG-E,Busia,gambiae,UG-E_gamb_2016,UG-E_gamb_2016_06,UG-E_gamb_2016_Q2,UG-E_Busia_gamb_2016,UG-E_Busia_gamb_2016_06,UG-E_Busia_gamb_2016_Q2
673,VBS10119-4954STDY7089647,UG4A2016D1_93,Mara Lawniczak,Uganda,Busia,2016,6,0.466,34.089,F,...,Eastern Region,UG-E,Busia,gambiae,UG-E_gamb_2016,UG-E_gamb_2016_06,UG-E_gamb_2016_Q2,UG-E_Busia_gamb_2016,UG-E_Busia_gamb_2016_06,UG-E_Busia_gamb_2016_Q2
674,VBS10120-4954STDY7089648,UG4A2016E1_92,Mara Lawniczak,Uganda,Busia,2016,6,0.466,34.089,F,...,Eastern Region,UG-E,Busia,gambiae,UG-E_gamb_2016,UG-E_gamb_2016_06,UG-E_gamb_2016_Q2,UG-E_Busia_gamb_2016,UG-E_Busia_gamb_2016_06,UG-E_Busia_gamb_2016_Q2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19766,SAMN15222632,D342,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19767,SAMN15222633,D343,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19768,SAMN15222634,D346,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19769,SAMN15222635,D347,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016


Example query to combine with other filters:

In [10]:
df_samples.query("country == 'Burkina Faso' and unrestricted_use")

,sample_id,partner_sample_id,contributor,country,location,year,month,latitude,longitude,sex_call,...,admin1_name,admin1_iso,admin2_name,taxon,cohort_admin1_year,cohort_admin1_month,cohort_admin1_quarter,cohort_admin2_year,cohort_admin2_month,cohort_admin2_quarter
19466,SAMN03299607,GOUND_0022,Jacob E Crawford,Burkina Faso,Goundry,2008,11,12.518,-1.341,UKN,...,Plateau Central,BF-11,Oubritenga,arabiensis,BF-11_arab_2008,BF-11_arab_2008_11,BF-11_arab_2008_Q4,BF-11_Oubritenga_arab_2008,BF-11_Oubritenga_arab_2008_11,BF-11_Oubritenga_arab_2008_Q4
19467,SAMN03299611,GOUND_0103,Jacob E Crawford,Burkina Faso,Goundry,2008,11,12.518,-1.341,F,...,Plateau Central,BF-11,Oubritenga,arabiensis,BF-11_arab_2008,BF-11_arab_2008_11,BF-11_arab_2008_Q4,BF-11_Oubritenga_arab_2008,BF-11_Oubritenga_arab_2008_11,BF-11_Oubritenga_arab_2008_Q4
19468,SAMN03299612,GOUND_0105,Jacob E Crawford,Burkina Faso,Goundry,2008,11,12.518,-1.341,UKN,...,Plateau Central,BF-11,Oubritenga,arabiensis,BF-11_arab_2008,BF-11_arab_2008_11,BF-11_arab_2008_Q4,BF-11_Oubritenga_arab_2008,BF-11_Oubritenga_arab_2008_11,BF-11_Oubritenga_arab_2008_Q4
19469,SAMN03299614,GOUND_0137,Jacob E Crawford,Burkina Faso,Goundry,2008,11,12.518,-1.341,F,...,Plateau Central,BF-11,Oubritenga,arabiensis,BF-11_arab_2008,BF-11_arab_2008_11,BF-11_arab_2008_Q4,BF-11_Oubritenga_arab_2008,BF-11_Oubritenga_arab_2008_11,BF-11_Oubritenga_arab_2008_Q4
19470,SAMN03299615,KODOU_0009,Jacob E Crawford,Burkina Faso,Kodougou,2008,11,12.520,-3.607,F,...,Boucle du Mouhoun,BF-01,Kossi,arabiensis,BF-01_arab_2008,BF-01_arab_2008_11,BF-01_arab_2008_Q4,BF-01_Kossi_arab_2008,BF-01_Kossi_arab_2008_11,BF-01_Kossi_arab_2008_Q4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19766,SAMN15222632,D342,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19767,SAMN15222633,D343,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19768,SAMN15222634,D346,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
19769,SAMN15222635,D347,Jacob Tennessen,Burkina Faso,Tengrela,2016,-1,10.700,-4.800,F,...,Cascades,BF-02,Comoe,coluzzii,BF-02_colu_2016,BF-02_colu_2016,BF-02_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016,BF-02_Comoe_colu_2016
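Because `unrestricted_use` is an ordinary boolean column, it also combines with standard pandas operations such as grouping. As a rough sketch, the code below counts unrestricted samples per country and taxon, using a toy dataframe in place of the output of `ag3.sample_metadata()`; the helper name `count_unrestricted` and the output column `n_samples` are hypothetical.

```python
import pandas as pd


def count_unrestricted(df_samples, by=("country", "taxon")):
    """Hypothetical helper: count unrestricted samples within each group."""
    unrestricted = df_samples[df_samples["unrestricted_use"].astype(bool)]
    return unrestricted.groupby(list(by)).size().reset_index(name="n_samples")


# Toy stand-in for the dataframe returned by ag3.sample_metadata().
toy_samples = pd.DataFrame(
    {
        "country": ["Burkina Faso", "Burkina Faso", "Burkina Faso", "Uganda"],
        "taxon": ["arabiensis", "coluzzii", "coluzzii", "gambiae"],
        "unrestricted_use": [True, True, True, False],
    }
)
counts = count_unrestricted(toy_samples)
print(counts)
```

The same pattern should work with other metadata columns, such as `year` or `sample_set`, by changing the `by` argument.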


**Note that all sample sets in the vector observatory can be accessed and analysed at any time for public health purposes. If any terms of use apply, they may restrict the public communication of any analysis results (publication embargo) for a period of time.**

If you have any questions about usage restrictions, please get in touch via support@malariagen.net.