# Mineral Search Using the RRUFF and WikiData Databases

The RRUFF database contains about 5,700 minerals (many are redundant entries with disparate names).  Many of these come with formulas that can be parsed into distinct compositions.

This workbook will create a SQLite database from the mineral compositions and then search this database.

In [None]:
using NeXLCore
using DataFrames
using SQLite

This code downloads the database from its source (on the Internet) and constructs a DataFrame containing the data.

It also attempts to parse the `IMA Chemistry (plain)` column to convert each mineral's formula to a mass-fraction representation.  This isn't always possible, as many "minerals" are ambiguously defined.

In [None]:
ENV["DATADEPS_ALWAYS_ACCEPT"]=true
mdb = loadmineraldata(true)

Load the minerals with parseable compositions into an in-memory SQLite database.  (You could write it to disk too, but...)

In [None]:
db = SQLite.DB()
NeXLCore.buildMaterialTables(db)
for mat in filter(!isnothing, mdb[:,:Material])
    NeXLCore.write(db, Material, mat)
end

Search the database on an elemental basis.

In this case, the search looks for palladium between 0.6062 and 0.6066 mass-fraction and lead between 0.3930 and 0.3940.

This search style is very flexible but a little tedious.

In [None]:
NeXLCore.findall(db, Material, Dict(n"Pd"=>(0.6062, 0.6066), n"Pb"=>(0.3930, 0.3940)))

Alternatively, you can search by composition.  It is easy to create compositions by mass-fractions using this syntax. 

The final argument is an absolute tolerance applied to each element's mass fraction, so this search is equivalent to Pd between 0.596 and 0.616 and Pb between 0.383 and 0.403.

In [None]:
NeXLCore.findall(db, mat"0.606*Pd+0.393*Pb", 0.01)
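
The tolerance arithmetic described above can be checked with a few lines of plain Julia. This is only a sketch: `tolerance_window` is a hypothetical helper, not part of NeXLCore, and it uses element symbols as strings rather than NeXLCore's `Element` type. It assumes the tolerance is applied as a simple plus/minus window around each nominal mass fraction, as the equivalence stated above suggests.

```julia
# Hypothetical helper (not part of NeXLCore): expand nominal mass fractions
# into (min, max) search windows using one absolute tolerance for all elements.
tolerance_window(fractions::Dict{String,Float64}, tol::Float64) =
    Dict(el => (c - tol, c + tol) for (el, c) in fractions)

windows = tolerance_window(Dict("Pd" => 0.606, "Pb" => 0.393), 0.01)

# Print the windows in alphabetical order of element symbol.
for (el, (lo, hi)) in sort(collect(windows))
    println(el, ": ", round(lo, digits=3), " to ", round(hi, digits=3))
end
```

The resulting windows are the same kind of `(min, max)` pairs that the element-by-element search takes, which is why the two search styles can express the same query.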

Or, using a chemical formula:

In [None]:
findall(db, mat"NaAlSi3O8", 0.001)

Let's try again with the WikiData database, which contains 3711 minerals.

In [None]:
wdm = wikidata_minerals()

In [None]:
db_wd = SQLite.DB()
NeXLCore.buildMaterialTables(db_wd)
for mat in values(wdm)
    NeXLCore.write(db_wd, Material, mat)
end

Let's perform the same searches as before...

In [None]:
NeXLCore.findall(db_wd, Material, Dict(n"Pd"=>(0.6062, 0.6066), n"Pb"=>(0.3930, 0.3940)))

In [None]:
NeXLCore.findall(db_wd, mat"0.606*Pd+0.393*Pb", 0.01)

In [None]:
findall(db_wd, mat"NaAlSi3O8", 0.001)

Of course, these are very simple searches, and much more sophisticated search algorithms can readily be imagined and implemented.
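
As one illustration of what such a search might look like, the sketch below ranks candidates by Euclidean distance from a measured composition in mass-fraction space, rather than applying a fixed window. It is not part of NeXLCore and does not touch the SQLite database: `composition_distance` and `rank_candidates` are hypothetical names, the candidates are held in plain dictionaries, their mass fractions are approximate values computed from the nominal formulas, and the "measured" composition is made up for the example.

```julia
# Approximate mass fractions computed from nominal formulas (illustrative only).
candidates = Dict(
    "Albite (NaAlSi3O8)"     => Dict("Na" => 0.0877, "Al" => 0.1029, "Si" => 0.3213, "O" => 0.4881),
    "Anorthite (CaAl2Si2O8)" => Dict("Ca" => 0.1441, "Al" => 0.1940, "Si" => 0.2019, "O" => 0.4601),
    "Quartz (SiO2)"          => Dict("Si" => 0.4674, "O" => 0.5326),
)

# Euclidean distance over the union of elements; absent elements count as zero.
function composition_distance(a::Dict{String,Float64}, b::Dict{String,Float64})
    elements = union(keys(a), keys(b))
    return sqrt(sum((get(a, el, 0.0) - get(b, el, 0.0))^2 for el in elements))
end

# Return the names of the `n` candidates closest to `target`, nearest first.
function rank_candidates(target, candidates; n=3)
    ranked = sort(collect(candidates); by = kv -> composition_distance(target, last(kv)))
    return first.(ranked[1:min(n, length(ranked))])
end

# Hypothetical measurement, chosen to lie close to albite.
measured = Dict("Na" => 0.085, "Al" => 0.105, "Si" => 0.320, "O" => 0.490)
ranking = rank_candidates(measured, candidates)
println(ranking)
```

A ranking like this always returns the nearest candidates, even when none falls inside a chosen tolerance, which can be useful for noisy measurements.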

See [materialdb.jl](https://github.com/usnistgov/NeXLCore.jl/blob/master/src/materialdb.jl) for how the database is organized and how to search it.