Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It adds a method at TRestTools to do http POST requests #172

Merged
merged 14 commits into from
Apr 1, 2022

Conversation

jgalan
Copy link
Member

@jgalan jgalan commented Mar 31, 2022

Large jgalan 162

  • It adds dependency with libcurl libraries
  • It adds a static method TRestTools::POSTRequest to do POST requests to a website.
  • TRestTools::ReadASCIITable added an optional argument to skip header lines.
  • TRestMetadata::GetSearchPath now adds also REST_USER_PATH as a default.
  • TRestTools::ExportASCIITable method added to write a file with the contents of std::vector<std::vector>>.

@rest-for-physics/core_dev

@jgalan jgalan requested review from nkx111 and Vindaar March 31, 2022 06:57
@jgalan jgalan marked this pull request as draft March 31, 2022 06:58
@jgalan
Copy link
Member Author

jgalan commented Mar 31, 2022

@lobis there is no libcurl installed at the docker? It was a library ready at sultan. Could you check in your docker image if it is available? tx

@lobis
Copy link
Member

lobis commented Mar 31, 2022

@lobis there is no libcurl installed at the docker? It was a library ready at sultan. Could you check in your docker image if it is available? tx

curl is installed, you can check via

docker run -it ghcr.io/lobis/root-geant4-garfield:cpp17_ROOT-v6-26-00_Geant4-v10.4.3_Garfield-af4a1451 curl --help

Perhaps the problem is elsewhere or I have to install some dev dependency @jgalan ?

Also I propose to replace the docker image to ghcr.io/lobis/root-geant4-garfield:rest-for-physics so we don't have to modify it anymore if we updated Garfield, Geant4 or root. This tag will point to a fixed image.

@jgalan
Copy link
Member Author

jgalan commented Mar 31, 2022

Did you have a look to the pipeline error? It does not find #include <curl/curl.h>

@lobis
Copy link
Member

lobis commented Mar 31, 2022

This is the correct output, its printing the help menu for curl, so it is installed

@lobis
Copy link
Member

lobis commented Mar 31, 2022

Did you have a look to the pipeline error? It does not find #include <curl/curl.h>

How can I Installed this? curl is already installed, I cannot find an apt package called libcurl-dev if that's whats missing.

@jgalan
Copy link
Member Author

jgalan commented Mar 31, 2022

Maybe curl is installed, but not the development libraries. Headers are not found.

sudo apt-get install libcurl-dev

@jgalan
Copy link
Member Author

jgalan commented Mar 31, 2022

@lobis
Copy link
Member

lobis commented Mar 31, 2022

#include <curl/curl.h>

I think it is libcurl4-openssl-dev. It should be installed now.

@lobis
Copy link
Member

lobis commented Mar 31, 2022

Perhaps we could use ROOT built in libraries to perform this request and thus avoid having to link this library to the framework, as not everybody will have it installed. (as we saw its not on Ubuntu 20.04 by default).

https://root-forum.cern.ch/t/http-get-request/16891/3

ROOT can download files via http so I am sure this is possible.

If this is not the case or its too hard, perhaps we could add this header to the external dependencies via cmake?

Also why don't we use http GET with query parameters to retrieve the data instead of POST?

@jgalan
Copy link
Member Author

jgalan commented Mar 31, 2022

Perhaps we could use ROOT built in libraries to perform this request and thus avoid having to link this library to the framework, as not everybody will have it installed. (as we saw its not on Ubuntu 20.04 by default).

https://root-forum.cern.ch/t/http-get-request/16891/3

Yes, I looked to that root forum thread previously and I tried the w3.C macro they provide without success. After a non-negligible amount of time trying out. Then I realised that this post was written 8 years ago, then I thought maybe the protocols have changed, etc.

ROOT can download files via http so I am sure this is possible.

Yes, likely, and we have our own method TRestTools::DownloadRemoteFile which I find quite convenient, as I find convenient to have TRestTools::POSTRequest.

If this is not the case or its too hard, perhaps we could add this header to the external dependencies via cmake?

I added it to the CMakeLists.txt a bit manually. We could perhaps have a REQUIRED curl in cmake so that the cmake execution fails and the system administrator is aware this library needs to be installed. It is a very popular library. In sultan it was already installed.

Also why don't we use http GET with query parameters to retrieve the data instead of POST?

I dont know, this was suggested by @Vindaar I am just following. Of course, we could have another method TRestTools::GETRequest but I am not sure about the benefits or issues of using one or the other method.

Plz, will you add curl-dev to the docker? :)

@lobis
Copy link
Member

lobis commented Mar 31, 2022

Plz, will you add curl-dev to the docker? :)

It is added #172 (comment). It isn't working?

@lobis
Copy link
Member

lobis commented Mar 31, 2022

I dont know, this was suggested by @Vindaar I am just following. Of course, we could have another method TRestTools::GETRequest but I am not sure about the benefits or issues of using one or the other method.

I this case we should use whatever is easier I think. The already existing method uses wget, we should have a single dependency to perform this type of requests, so either move TRestTools::DownloadRemoteFile to use curl, or use wget to perform the post request in this new method. Since it looks like we're already using wget (even though its not an official dependency, we just call it from system but...) I think we shoult not include curl and just use wget

@lobis
Copy link
Member

lobis commented Mar 31, 2022

I added this PR #173 which adds cpr as a CMake FetchContent dependency (so user does not have to install anything). I think the library provides a much better syntax for requests. However there is a problem with linking the library... I think its due to the COMPILELIB macro.

@jgalan jgalan requested review from lobis and a team March 31, 2022 14:33
@jgalan jgalan marked this pull request as ready for review March 31, 2022 14:33
@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

Also why don't we use http GET with query parameters to retrieve the data instead of POST?

My solution is simply reverse engineered by looking at the network communication the browser does when using the tool. Initially I tried a GET with parameters in my code as well, but always got an error from the server. Feel free to try it yourself (i.e. just using curl in the command line), would love to see a simpler solution.

@nkx111
Copy link
Member

nkx111 commented Apr 1, 2022

Why not calling curl from system() as TRestTools::DownloadRemoteFile() does? Then no extra dependency needs to be added.

@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

Why not calling curl from system() as TRestTools::DownloadRemoteFile() does? Then no extra dependency needs to be added.

While this doesn't add a dependency to REST, it makes the whole functionality more brittle. Makes the code more platform dependent, but more importantly we now have an implicit dependency. While curl is extremely ubiquitous, I don't think it's a good idea to depend on it being installed.

@nkx111
Copy link
Member

nkx111 commented Apr 1, 2022

While curl is extremely ubiquitous, I don't think it's a good idea to depend on it being installed.

I don't quite understand. Isn't that the current implementation including curl header also dependent on curl being installed?

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

While curl is extremely ubiquitous, I don't think it's a good idea to depend on it being installed.

I don't quite understand. Isn't that the current implementation including curl header also dependent on curl being installed?

No, curl command might be installed/compiled as a binary, but you need the development, usually a separate package curl-dev or curl-devel ...

Strange, no? This didnt happen on Gentoo where you compile everything :)

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

@Vindaar How it would look like the system call?

@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

@Vindaar How it would look like the system call?

Sorry, what are you asking?

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

@Vindaar How it would look like the system call?

Sorry, what are you asking?

Actually, what would be the full curl command that takes in the argument the url and a POST request with key:value arguments, and writes the result to a file just as DownloadRemoteFile does with wget.

@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

@Vindaar How it would look like the system call?

Sorry, what are you asking?

Actually, what would be the full curl command that takes in the argument the url and a POST request with key:value arguments, and writes the result to a file just as DownloadRemoteFile does with wget.

Ahh, I see.

The POST request:

curl -d "Layer=Au&Ldensity=-1&Thick=250.0&Sigma1=0.0&Substrate=SiO2&Sdensity=-1&Sigma2=0.0&Pol=0&Scan=Energy&Min=7495.0&Max=15000.0&Npts=499&temp=Angle+%28deg%29&Fixed=0.95&Plot=LinLog&Output=Plot" \
-X POST "https://henke.lbl.gov/cgi-bin/laymir.pl"

That returns the HTML. Use your favorite way to extract the HREF from the resulting HTML (if done on the command line xmllint is probably the best choice; probably better done in code using custom parsing. As the HTML is so simple and doesn't change, a find the <a HREF=", then parse until "> is fine).

Then just GET:

curl "https://henke.lbl.gov/cgi-bin/<extracted_link>"

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

Ok, everyone happy with the system option? @lobis also?

@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

Ok, everyone happy with the system option? @lobis also?

As I alluded to above, I don't think relying on an installed binary is a good idea. Rather deal with the dependency once at compilation stage of REST and be certain that a compiled REST is able to fetch the files than pray that the dependency is there at runtime and deal with the fallout of that.

But feel free to go with it.

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

Ok, everyone happy with the system option? @lobis also?

As I alluded to above, I don't think relying on an installed binary is a good idea. Rather deal with the dependency once at compilation stage of REST and be certain that a compiled REST is able to fetch the files than pray that the dependency is there at runtime and deal with the fallout of that.

But feel free to go with it.

Right, I also prefer to deal with the installation of the library if needed at compilation time, for me just a cmake REQUIRED curl would suffice.

It is also true that some time ago I tried to compile GDL the GNU version of IDL, and I gave up because I had to disable 100 dependencies. It is true that a 100 dependencies is not good.

@juanangp
Copy link
Member

juanangp commented Apr 1, 2022

Ok, everyone happy with the system option? @lobis also?

I am personally against any system call. I think we should actually remove any system call from the code. See for instance http://www.cplusplus.com/articles/j3wTURfi/

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

Ok, so thats the new commit I added, it does not use a system call.

@lobis
Copy link
Member

lobis commented Apr 1, 2022

I also don't like system calls but I think it's better than relying on adding new user requirements. In my opinion the best would be to use the CPR library installed via cmake fetch content, but there are some bugs I couldn't fix.

@Vindaar
Copy link
Member

Vindaar commented Apr 1, 2022

Why not use something like cpp-httplib?

@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

I need the updates at this PR at master, the new methods added are operative and documented, could you please approve and create an issue about upgrading DownloadRemoteFile and POSTRequest methods to avoid using system and curl/curl.h?

From my point of view we will survive till the previous issue is solved.

@lobis
Copy link
Member

lobis commented Apr 1, 2022

Why not use something like cpp-httplib?

I was suggesting https://github.com/libcpr/cpr but https://github.com/yhirose/cpp-httplib seems also fine.

Copy link
Member

@lobis lobis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove libcurl requirement for the user

@jgalan jgalan requested review from a team and juanangp April 1, 2022 15:01
@jgalan
Copy link
Member Author

jgalan commented Apr 1, 2022

TODO: remove libcurl requirement for the user

Probably this TODO is lost if not in an issue

@jgalan jgalan merged commit 1d60c10 into master Apr 1, 2022
@jgalan jgalan deleted the jgalan_tools_post branch April 1, 2022 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants