EELS DB Website Integration #475
Comments
Hi Phil, The new site looks great! The integration of the EELSdb with HyperSpy has been discussed at length in this post and in this issue #146. In summary, most of us think that it'll be truly cool to provide direct access to the EELSdb through HyperSpy. As I understand from your message, the API to query the database and download the spectra is already in place. If you provide us with the details, we will be very happy to start working on it. Francisco |
Looks nice! Having some way to import/export directly between EELSdb and HyperSpy would be very nice. Is there some info about the API anywhere? I tried uploading a msa-spectrum exported from HyperSpy, but it doesn't seem to work properly: http://eelsdb.eu/?post_type=spectra&p=20142 |
Phil has already begun to work on the API. More details can be found there: http://eelsdb.eu/api/ Thanks for the report Magnus, we will have a look at it. Luc |
In HyperSpy we've done our best to comply with the EMSA/MSA standard. Francisco
|
Ok, nice! Thanks for the fast responses. I wasn't aware that there was a standard. I'm pretty sure that it'll just be a case of the parsing on our side, I've already adapted it a couple of times to fix similar issues. I'll do some reading and see if I can make it a little more robust. I'll also try to read through all of the conversations you guys have already had on this topic. It sounds like we're on the same page though, which is excellent news. Phil |
@magnunor: I just had a look at the file you uploaded. It doesn't work as I wasn't aware that |
In case someone else is experimenting with this, the following code will get the data in http://api.eelsdb.eu/spectra, grab the first spectrum and plot it:

    import json
    import urllib2  # Python 2 only

    import hyperspy.hspy as hspy


    def get_eelsdb_data(url=None):
        if url is None:
            url = 'http://api.eelsdb.eu/spectra/'
        # The API answers with HTTP 403 unless a User-Agent header is sent.
        headers = {'User-Agent': 'HyperSpy'}
        req = urllib2.Request(url, headers=headers)
        return urllib2.urlopen(req)


    # Fetch the spectra listing and take the download link of the first entry.
    all_spectra = json.load(get_eelsdb_data())
    spectra_link = all_spectra[0]['download_link']

    # Save the raw msa file to disk, then load it with HyperSpy.
    raw_msa_file = get_eelsdb_data(spectra_link)
    with open('test_spectra.msa', 'w') as file_output:
        file_output.write(raw_msa_file.read())

    s = hspy.load('test_spectra.msa')
    s.plot()
    hspy.plt.savefig("test_spectra.png")

Note you need to specify some kind of user agent, or you'll get a 403 HTTP Error. Currently it is a bit convoluted, since it is saving a file, then loading it. We could probably open the text from the urllib2 response directly with the io/msa plugin. We should probably discuss how best to implement this in HyperSpy. Should we keep that discussion in this issue, or make another? |
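For anyone on Python 3, the request part of the snippet above might look roughly like the sketch below. It only covers sending the User-Agent header and keeping the download in memory rather than writing a file; `build_request` and `fetch_text` are illustrative names, not part of HyperSpy or the EELSdb API, and how the msa plugin would consume the in-memory text is left open.

```python
import io
import urllib.request

API_URL = "http://api.eelsdb.eu/spectra/"  # URL quoted in this thread


def build_request(url=API_URL, user_agent="HyperSpy"):
    """Return a Request carrying the User-Agent header the API requires."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_text(url=API_URL, timeout=30):
    """Download `url` and return its body as an in-memory text stream."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return io.StringIO(resp.read().decode("utf-8", errors="replace"))
```

`fetch_text(spectra_link)` would then stand in for the write-to-disk step, assuming the reader can accept a file-like object.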
I added a way to import the EELSdb spectra straight into HyperSpy spectra using the msa io-plugin:
The implementation is not very robust, but seems to work nicely :) |
Nice! I've been working on an alternative implementation. I'll make a PR in an hour or so, so that we can discuss the different approaches. |
Hi all, @magnunor - great that you managed to get something working so quickly! Please don't put too much work into it yet though, what's there currently was basically the result of me having a play whilst waiting for a flight at the airport and is liable to change lots yet.. I just started writing a really long rambling post with all of my thoughts on this after reading your previous discussions. I think what it boils down to is that proper authentication with The site is built using WordPress, using a number of plugins including one I've written to handle everything to do with the spectra. The WP community is kind of huge and as I was writing this I came across what appears to be quite a mature plugin to handle externally facing APIs: WP-API. From what I've just read of the docs it sounds like I should be able to extend it to do basically everything we would need. As such, don't go ahead with the current API yet as I'm probably going to have to ditch that code and start again. The result structure will almost certainly change as a result. Routes I'll aim to create:
Can anyone think of any I've missed? I've stopped short of adding stuff like the forum and job postings, plus page content. People can go to the site for that stuff.
I don't think I want to open up user authentication via the API, so users will have to visit the site to register, and also to allow API access for the first time with a new token. This is the same as most other website APIs I've used (e.g. GitHub).
I'm personally not keen on the idea of using a local version of the database, however I think caching results would be sensible and would certainly help limit excess resource usage on the site (the server setup isn't hugely beefy). If we start hitting slowdown trouble we might have to limit access to authenticated users only and then serve only new/modified spectra from
Getting the above to work properly is quite a lot of work. I'm inclined to put this off until the main site has gone live to avoid further delays. @llajaunie - what do you think?
HyperSpy people - what do you think? Would this do everything you want? How do you envisage extending HyperSpy to make use of this API? Apologies, I'm completely naïve about your software - am I right in assuming that it's essentially script / command line usage only, or does it use a graphical interface? How do you plan to get the authentication to work? How much work will be required from your end?
Phil |
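The client-side caching idea mentioned above could be sketched as follows. This is a minimal illustration, assuming results are keyed by request URL and simply expire after a fixed time; `TTLCache` is a hypothetical helper, not something that exists in HyperSpy or on the site.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=3600.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key, fetch):
        """Return the cached value for `key`, calling `fetch(key)` on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch(key)
        self._store[key] = (now + self.ttl, value)
        return value
```

Repeated queries within the expiry time would then be answered locally, so only the first one reaches the server.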
Also - the route URLs I've written above are just what makes sense in my head and may change when I come to actually write this code if alternatives are easier. For instance, I think WP-API uses a URL structure that would look like But yes, take everything with a pinch of salt for now ;) |
Also, it sounds like the team behind WP-API are aiming squarely at getting integrated into the WordPress core, which would be really nice for the EELS website but could also mean that the code base behind the new API would be a bit unstable. I found some posts saying that they were aiming for version 3.8, then 4.1 and now "There's no fixed timeline for integration into core at this time, but getting closer!" (we're currently on v4.1.1). This could be a good reason to incubate this idea for a while, whilst the main site goes live. Though I think it's dangerous for me to wait for them fully as it could take them forever. So I'll carry on regardless for now I think. Phil |
@magnunor, I have just made a PR (#477) with an alternative implementation of the feature. The main differences with your implementation are:
I like the idea of enabling loading from the web, but I think that, if we provide the feature, we should provide it for all the supported file formats, what do you think? Regarding requests vs urllib2, I think that using requests the code is simpler and more readable. Of course, the drawback is that we'll need to depend on one extra library. However, requests seems to be very popular, so this might not be an issue in this case. What do you think? |
Having hspy.load accept URLs (for any file format) would be nice even without the DB, although it might be more tricky to correctly determine file type. Maybe prompt if unable to determine? |
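One possible way to determine the file type of a URL is to look at the extension of its path and fall back to asking the user when that fails. The sketch below assumes exactly that; `guess_format` and the extension table are hypothetical and only list the two formats named in this thread.

```python
import os
from urllib.parse import urlparse

# Hypothetical mapping; only msa and dm3 are named in this thread.
KNOWN_EXTENSIONS = {".msa": "msa", ".dm3": "dm3"}


def guess_format(url):
    """Return a format name from the URL path's extension, or None."""
    path = urlparse(url).path
    ext = os.path.splitext(path)[1].lower()
    return KNOWN_EXTENSIONS.get(ext)
```

A `None` result is the case where a prompt would be needed, for example for links such as http://eelsdb.eu/?post_type=spectra&p=20142 that carry no file name at all.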
@ewels, as you see, we were really desperate to have the API :) The current API is working really well for our purposes. Of course, enabling submission of spectra through the API would be the cherry on the cake and we hope that it'll help you gather more spectra for the data base. Actually, I very much like @Gedeonval's compare2DB method, which suggests submitting the spectrum to the data base if a query returns no spectra. I think that this can become a powerful tool to get more spectra into the EELS Data Base. HyperSpy is not GUI based, but @vidartf is working on a GUI for HyperSpy, HyperSpyUI. We'll have to discuss what's the best way of implementing spectra submission once the API is ready, but I think that we can make it reasonably easy for the end-user. We're going to release a new version of HyperSpy on the 28th of March and we would like to have this feature in if possible. It's not a problem at all if you decide to change the API in the near future, just let us know in good time so that we can adapt our code to the new version before you remove the current version of the API. I have a few comments:
|
@francisco-dlp Just a quick reply to your suggestions/remark regarding the "EELS" part of the website.
The point of the new website is not only to add some functionality but also to keep the use simple. It is, in my opinion, the only way to attract more users and to encourage submission. @ewels As you said, I think it can wait :) |
@francisco-dlp |
And what is the license for the spectra in the database? |
After thinking about it a bit more I think that it might not be such a good idea after all to submit spectra directly from HyperSpy using the API. For that we'll need to replicate the EELSdb submission form for the command line interface, the GUI and the IPython notebook and update the code with every new version of the submission form. This is of course doable but, is it worth it? What I suggest to do instead is to implement a |
@magnunor Regarding the copyright, the situation is somewhat unclear, as it was never stated explicitly for the old website. I was, however, thinking of putting the database itself and the spectra under a Creative Commons licence, CC-BY-NC-SA. Any thoughts on this are welcome. |
@francisco-dlp - this approach would be preferable for me as this means the complexity and requirements for the API are greatly reduced. The only problem is that you can't pre-fill a file upload input in a web page form (browsers don't allow this for security reasons). But saving the file and opening that directory so that the user can drag and drop it onto the form would work. With regard to your earlier questions:
@magnunor: Yes, I think I should probably log IP addresses (or logged in users, or both) and impose rate limiting on API calls. I need to think about how to implement this. Ok, so in summary it sounds like maybe I should take a step back from my previous (quite ambitious) plan for the API that I wrote above. Given the time pressure of impending releases both for HyperSpy and the website, how does this sound:
I think this is feasible within the time frame that we're talking about (~2 weeks). Authentication / depositing data through the API can be revisited at a later date when everything else is stable. Phil |
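As an illustration of the rate limiting mentioned above, here is a sliding-window limiter keyed by client (an IP address or a user name). It is a sketch of the general technique in Python, not of how the WordPress site implements or would implement it; `RateLimiter` and its default limits are made up for the example.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `limit` calls per `window` seconds for each client key."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # client key -> recent call times

    def allow(self, client):
        """Record a call for `client` and return True if it is within the limit."""
        now = self.clock()
        hits = self._hits[client]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Each API call would check `allow(ip_address)` first and answer with an error (HTTP 429 is the usual choice) when it returns False.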
In the submission form you collect important information (e.g. collection angle, beam energy, etc.) that can be stored in the msa file. What I suggest is to add/overwrite the metadata in the msa file with the information collected in the submission form. Currently it could happen that the msa file contains wrong or incomplete information, which could mislead the users of the database.
Then, why don't you stop asking for the edges at all? They can be automatically inferred from the chemical formula and the spectral range.
Personally I would hesitate to use spectra with the background removed, so I think that it'll be good to be able to query the data base for spectra that don't suffer from background removal.
I see the point. However, I'm sure that you'll agree with me that all post-processing can lead to artifacts so, for a data base that serves reference purposes, I think that it is of great importance to store the raw data. Actually, given that the submitter certainly has the raw data, in my opinion providing it should be mandatory. In the future, you could use HyperSpy on the server side to post-process the spectra.
On the subject of encouraging submission, I wonder if allowing anonymous publishing would help. Some people may like to upload spectra to get credit and attract publicity to their papers; I think that you address those efficiently already. However, others may have good spectra that they hesitate to upload because they may feel judged if their name appears alongside the spectra. Of course I'm not suggesting fully anonymous submission, but you could store the submitter's name without displaying it publicly. Also, it might be a good idea to encourage submission of spectra that have not been published. Many experiments don't lead to publication but they still produce excellent spectra that might be worth storing in the data base. This is a win-win situation: the EELS data base gets more spectra and the submitter can collect citations for the data even if it didn't lead to journal publication (for this the data must have a DOI, but these days this is easy; see e.g. Zenodo) |
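The suggestion above that edges can be inferred from the chemical formula and the spectral range could be sketched along these lines. The onset energies are approximate, illustrative values for a handful of elements only, the formula parsing handles just simple formulas, and `elements_in` and `edges_in_range` are hypothetical names; a real implementation would need a complete, vetted edge table.

```python
import re

# Approximate edge onsets in eV, for illustration only; a real lookup
# would need a complete, vetted table.
EDGE_ONSETS_EV = {
    "C": {"K": 284},
    "N": {"K": 401},
    "O": {"K": 532},
    "Ti": {"L3": 456},
    "Fe": {"L3": 708},
}


def elements_in(formula):
    """Extract element symbols from a simple formula such as 'Fe2O3'."""
    return sorted(set(re.findall(r"[A-Z][a-z]?", formula)))


def edges_in_range(formula, e_min, e_max):
    """List (element, edge, onset) whose onset lies within [e_min, e_max]."""
    found = []
    for element in elements_in(formula):
        for edge, onset in EDGE_ONSETS_EV.get(element, {}).items():
            if e_min <= onset <= e_max:
                found.append((element, edge, onset))
    return found
```

With this, the submission form would only need the formula and the energy range of the spectrum, and the edge list could be filled in automatically.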
Some further suggestions and comments:
|
@ewels, your plan sounds excellent to me.
Couldn't you provide an API for that? |
@francisco-dlp - Much of what you suggest sounds nice, but a lot of it involves much more involved data file interrogation. Specifically, to get metadata from the file, the submission will need to be re-written into a two stage process. This is fine in principle but we're quite a long way into development to be making such changes. Maybe version 3 of the website..? |
I'm not sure what you mean. This is a limitation of the web browser software, not the website. If you mean an API for uploading spectra, then yes - but that has the limitations discussed above (security considerations, authentication, amount of work for implementation). |
What I mean is an API to pass the path of the file, not the file itself, is that possible? Authentication should be handled by the website. |
Do you have Python on the server? If you do, you could use our msa reader/writer instead of writing your own. |
Getting the file path is not a problem, it's what to do with it. You can set variables in any HTML form element except for file uploads.
Yup, we have Python. However, I'm not sure what restrictions we have on installing Python modules. Certainly, if we want to go down the route of parsing file uploads (and probably storing the raw data in a database instead of files) then this would be a nice way to go if possible. |
That makes a lot of sense, thanks! We'll implement the method that you propose then. Regarding Python, if you cannot install the required libraries, we can always take out the msa reader/writer and the dm3 reader from HyperSpy. The msa reader only requires Python. The dm3 reader also requires numpy. |
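To give an idea of why an msa reader needs nothing beyond plain Python, here is a deliberately simplified parser. It is not HyperSpy's reader: it assumes '#KEY : value' header lines and comma- or space-separated numbers between '#SPECTRUM' and '#ENDOFDATA', and it does no validation against the EMSA/MSA standard (for instance, unit suffixes inside header keys are kept as part of the key).

```python
def parse_msa(text):
    """Split EMSA/MSA-style text into a header dict and a list of data rows.

    Simplified: assumes '#KEY : value' header lines and numeric data
    between '#SPECTRUM' and '#ENDOFDATA'.
    """
    header, rows, in_data = {}, [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, _, value = line[1:].partition(":")
            key = key.strip().upper()
            if key == "SPECTRUM":
                in_data = True
            elif key == "ENDOFDATA":
                break
            else:
                header[key] = value.strip()
        elif in_data:
            # Accept both comma- and whitespace-separated columns.
            rows.append([float(v) for v in line.replace(",", " ").split()])
    return header, rows
```

Server-side code could use something like this to compare the header fields against the values entered in the submission form.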
pps. I got onto a bit of a roll and also added functionality for listing registered users and website news (no-one is going to use the news I suspect, but it was easy to implement) |
@francisco-dlp / @magnunor - we're about to submit a manuscript about the website, including a description of the API. We'd like to mention integration with HyperSpy - is this ok? We were thinking of including an example line of code to show how easy it will be to pull down spectra from the EELS DB. I grabbed this from the above thread:

    utils.plot.plot_spectra(
        datasets.tem_eels.eelsdb(element=["Fe", "O"], type="coreloss"))

Would you be happy about this? No problem if not.. Here's the catch - @llajaunie is hoping to submit tomorrow (Sunday 1st Nov). Feel free to make any number of rude comments about our organisational skills! |
Yup, that's no problem. Ok if we write that it's "soon to be released" or something similar? |
That's perfectly ok. Actually we have a few features almost ready to ship, I'll update the code and test the new API next week.
|
Fantastic! Thank you! |
I am experiencing a problem with the certificate of the api. This is how Firefox complains:
The Python code also complains similarly but I can workaround the issue by disabling certificate verification. @ewels, can you get a valid certificate for the api? |
Damn, yes I had noticed this in my browser. I was hoping that it wouldn't cause a problem but evidently it will. I'll look into buying a second SSL certificate to cover the |
Great. I'll disable certificate verification in the mean time so that we can test the code. Let us know when you get the certificate so that we can re-enable it. |
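For reference, disabling certificate verification with the Python 3 standard library can be done roughly as below. This is a sketch of the general technique, not the code in #477, and `make_context` and `open_url` are illustrative names. Skipping verification removes the protection that HTTPS gives, so it only makes sense as a temporary workaround while testing.

```python
import ssl
import urllib.request


def make_context(verify=True):
    """Return an SSL context; verify=False skips certificate checks (insecure)."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be switched off before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def open_url(url, verify=True, timeout=30):
    """Open `url`, optionally without certificate verification."""
    return urllib.request.urlopen(
        url, context=make_context(verify), timeout=timeout)
```

Keeping `verify=True` as the default means that re-enabling verification later is just a matter of dropping the argument.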
Yup! It's now ordered, so should come through pretty soon. |
Thanks, that was quick! |
Hi @francisco-dlp - SSL certificate installed, so should be working properly now. Chrome has stopped complaining about it, so I think we're good. |
Thanks @ewels, it works fine now. I have updated the code in #477 to work with the new version of the API and (almost) everything seems to work nicely. Very nice job! I have found a few minor issues:
Finally, I think that it would be useful to query the data base also using the id key in order to be able to refer unambiguously to a certain spectrum. |
|
HyperSpy doesn't support csv because csv is not a standard. Wouldn't it be possible to upload the spectrum in the msa format? Some of the spectra with the best resolution are monochromated, as explained in "description". For example https://api.eelsdb.eu/spectra/diamond-6/ |
Maybe. I will check what I have left from this experiment. Indeed, there are some which are monochromated. I will edit them later.
|
Monochromated spectra have been edited. I hope I didn't miss any. I added the bug report on search by energy resolution. |
How can I get only monochromated spectra with the API? I've tried https://api.eelsdb.eu/spectra/?monochromated=Yes but it doesn't work. |
I've never been able to test that filter as there have never been any spectra. I'm on a plane now but will try to look into the above issues tomorrow. Phil |
Hi @francisco-dlp, To get monochromated spectra, you need Sorry about this, took me a while to figure it out. I've updated the docs page. Apologies for the delay. Phil |
Hi @ewels, I've fixed it. Hopefully it'll be ready to merge now. Thanks! |
Brilliant! :) |
Merged, right? So this can be closed? |
Yup, thanks everyone! |
@ewels, currently the EELSdb integration tests are failing as follows:
According to my browser, it seems that the certificate has expired:
|
Aha, that time of year again is it? Annoying that I didn't get any warning about this expiring, my apologies. I'll look into getting it renewed now. Phil |
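An advance warning of the kind missing here could be scripted from the certificate's 'notAfter' field, which Python's `ssl` module can convert to a timestamp. The sketch below assumes the expiry string has already been obtained (e.g. from `SSLSocket.getpeercert()`); `days_until_expiry`, `needs_renewal` and the 30-day threshold are illustrative choices, and the date used in the usage note is a made-up example.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's 'notAfter' timestamp is reached.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. 'Jun 20 12:00:00 2016 GMT'.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def needs_renewal(not_after, warn_days=30, now=None):
    """True when the certificate expires within `warn_days` days."""
    return days_until_expiry(not_after, now) <= warn_days
```

Running `needs_renewal("Jun 20 12:00:00 2016 GMT")` from a scheduled job or a test suite would flag the certificate before it actually expires.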
Ok, just waiting for approval from @llajaunie then will purchase replacements. Would like to use letsencrypt.org but seems to be difficult on shared hosts. |
Also, @francisco-dlp - I think that the SSL certificate for |
Thanks @ewels, now it is working again.
|
Hi there,
I'm the web developer behind a recent redevelopment of the EELS Database website. The old site (http://pc-web.cemes.fr/eelsdb/) has been around for years and serves as a public repository for EELS spectra.
The new site (http://eelsdb.eu) has been re-written from scratch and is currently in beta, due to go live soon. The main objective was to open up the site a lot more - both to make data submission far simpler, but also to make it easier to browse and access the data. There are currently ~220 spectra on the site and we hope it will grow significantly after launch.
A recent addition to the site is a new API. It's currently very simple - it allows spectra metadata to be retrieved (with a link to the raw data file), and browsed / searched. It could be extended to handle other features such as data uploads and comments (which would presumably require authentication of some kind). Data is currently returned as JSON.
I'm not a physicist myself, so I'm totally out of my depth when it comes to the science of these spectra (I'm a bioinformatician in my day job). However, Luc Lajaunie, who is handling the project, said that integration between the site and HyperSpy could be cool! Certainly it would be nice to lower the barriers to deposition and retrieval of data on the archive in any way possible. For users of HyperSpy it could be an opportunity to easily access a large database of spectra for comparison / use.
Let me know if you guys on the project would be interested in any such integration. Probably a good start would be to have a look at the beta site (I think you can register yourselves, or drop me a mail), then if you're keen we can chat about the scope of any work.
Phil