iNat import #1955

Open
5 of 10 tasks
mo-nathan opened this issue Feb 22, 2024 · 16 comments · May be fixed by #2150

@mo-nathan
Member

mo-nathan commented Feb 22, 2024

Tasks

  • Questions:
    • General organization
    • Better names for classes, methods, variables, ex: ImportedInatObs
    • Mapping iNat obs locations: ignore? use smallest existing MO loc? map some to MO exact equivalents, ex: "Lane County, OR, USA"?; adopt iNat scheme (separate PR).
    • iNat observations/photos with restrictive licenses (e.g. no-derivatives). Options: don't import; require MO user consent.

Done??

  • Import single iNat Observation (in foreground, unauthenticated)
  • Isolate call to MO API instead of conditioning images on non-test environment.
  • Change API Key to something temporary that I can delete after fixing environment variables
  • Add production MO API Key for webmaster
  • ?? Slime molds: Better solution than iconic_taxon_name: "Protozoa"? probably too complicated: ("Slime molds" are polyphyletic).
@JoeCohen
Member

JoeCohen commented Mar 22, 2024

iNat API documents:

@JoeCohen
Member

JoeCohen commented Mar 22, 2024

To play with API requests:

  • Go to iNaturalist API
  • Scroll down to the list of objects
  • Click on an object (e.g. Observations)
  • Click an Operation (e.g. GET Observations aka Observation Search)

A very simple API request for a single iNat Observation (202555552); it returns tons of data:
https://api.inaturalist.org/v1/observations?id=202555552
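
A minimal Ruby sketch of the same request, in case it helps with experimenting. The helper names (`inat_observation_uri`, `first_observation`) are hypothetical, and the code assumes the response wraps its observations in a top-level "results" array; check that against a real response before relying on it.

```ruby
require "json"
require "uri"

# Endpoint taken from the request URL above.
INAT_OBSERVATIONS_ENDPOINT = "https://api.inaturalist.org/v1/observations"

# Build the URI for a single-observation request (hypothetical helper).
def inat_observation_uri(inat_id)
  uri = URI(INAT_OBSERVATIONS_ENDPOINT)
  uri.query = URI.encode_www_form(id: inat_id)
  uri
end

# Pull the first observation out of a response body.
# Assumption: the body is JSON with a top-level "results" array.
def first_observation(json_body)
  JSON.parse(json_body, symbolize_names: true).fetch(:results, []).first
end

# Possible usage (needs network access, so it is left commented out):
#   require "net/http"
#   body = Net::HTTP.get(inat_observation_uri(202555552))
#   obs = first_observation(body)
```

Keeping URL building and response parsing separate from the HTTP call should make both easy to test against a saved fixture.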

@JoeCohen
Member

JoeCohen commented Mar 23, 2024

Plan: start with a foreground job that imports a single iNat Observation without authentication.
Later:

  • Import multiple observations
  • Do it in the background
  • Deal with authentication. Possibly wait until spring mushroom season, add a bunch of easily improvable Observations to iNat, then apply for an application.

@JoeCohen JoeCohen changed the title Import observation from iNat into MO iNat import Mar 26, 2024
@JoeCohen JoeCohen self-assigned this Mar 26, 2024
@JoeCohen
Member

JoeCohen commented Apr 20, 2024

Photos
They're in AWS.
pseudocode:

obs[:observation_photos].each do |photo|
  aws_id = photo[:photo_id]
  url = "https://inaturalist-open-data.s3.amazonaws.com/photos/#{aws_id}/original.jpeg"
  # import the image at `url`
end

To get an image into a local tmp file (per Copilot):

require 'open-uri'

url = 'http://example.com/image.png' # Replace with the actual image URL
local_file_path = Rails.root.join('tmp', 'image.png') # Specify the local file path

# Download the image and save it locally.
# Use URI.open: on Ruby 3.0+, Kernel#open no longer opens URLs.
URI.open(url) { |remote| IO.copy_stream(remote, local_file_path.to_s) }

# Now `local_file_path` contains the downloaded image

Tempfile Class: The Tempfile class in Ruby provides a convenient way to manage temporary files. When you create a Tempfile object, it automatically generates a unique filename in the OS’s temp directory. You can perform standard file operations on it, such as reading, writing, and changing permissions. Here’s how you can use it:

require 'tempfile'

file = Tempfile.new('foo')
file_path = file.path # A unique filename in the OS's temp directory
file.write('hello world')
file.rewind
content = file.read
file.close
file.unlink # Deletes the temp file
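
Putting the two ideas together, here is a rough sketch of building the photo URLs and downloading one into a Tempfile. The method names are hypothetical; the URL pattern (including the `.jpeg` extension) is copied from the pseudocode above and may not hold for every photo.

```ruby
require "open-uri"
require "tempfile"

# Base URL taken from the pseudocode above.
INAT_PHOTO_BASE = "https://inaturalist-open-data.s3.amazonaws.com/photos"

# Build one URL per photo of an observation hash (hypothetical helper).
# Assumption: every photo is available as "original.jpeg".
def inat_photo_urls(obs)
  obs.fetch(:observation_photos, []).map do |photo|
    "#{INAT_PHOTO_BASE}/#{photo[:photo_id]}/original.jpeg"
  end
end

# Download a URL into a Tempfile and return it, rewound for reading.
# The caller is responsible for calling close and unlink when done.
def download_to_tempfile(url)
  tempfile = Tempfile.new(["inat_photo", ".jpeg"], binmode: true)
  URI.open(url) { |remote| IO.copy_stream(remote, tempfile) }
  tempfile.rewind
  tempfile
end
```

A Tempfile avoids having to pick a unique name under `tmp/`, at the cost of having to clean it up explicitly once the image has been handed off.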

@JoeCohen
Member

JoeCohen commented May 3, 2024

How do I copy an image from an external website to a (temp) file on the server?

Nathan Wilson nathan@collectivesource.com Sun, Apr 21, 2024 at 5:04 AM
...
At least leveraging the code in the API makes sense to me. Not sure it's worth creating a service that just translates between the APIs, but it would be cool and might get us to at least improve the online documentation around our API. I heard at last year's NAMA foray that ChatGPT is better at creating code using the iNat API than the MO API. I think Alan has looked at that and might have some examples.

On Sun, Apr 21, 2024 at 12:19 AM Jason Hollinger <pellaea@gmail.com> wrote:
Yes, check out around line 58 of app/classes/api2/core/uploads.rb for one possible solution.
It might be altogether a dumb idea to literally just use the API to accomplish your whole task.

@nimmolo
Contributor

nimmolo commented May 3, 2024 via email

@JoeCohen
Member

JoeCohen commented May 4, 2024

From README_API.md

The response will include the id of the new record.

Attach the image as POST data or URL. See script/test_api for an example of how to attach an image in the POST data.

There is no script/test_api.
Maybe test/controllers/api2_controller_test.rb#test_post_maximal_image?

  def test_post_maximal_image
    setup_image_dirs
    rolf.update(keep_filenames: "keep_and_show")
    rolf.reload
    file = Rails.root.join("test/images/Coprinus_comatus.jpg").to_s
    proj = rolf.projects_member.first
    obs = rolf.observations.first
    File.stub(:rename, false) do
      post_and_send_file(:images, file, "image/jpeg",
                         api_key: api_keys(:rolfs_api_key).key,
                         vote: "3",
                         date: "20120626",
                         notes: " Here are some notes. ",
                         copyright_holder: "My Friend",
                         license: licenses(:ccnc30).id.to_s,
                         original_name: "Coprinus_comatus.jpg",
                         projects: proj.id,
                         observations: obs.id)
    end
    ...
  end

  def post_and_send_file(action, file, content_type, params)
    body = Rack::Test::UploadedFile.new(file, "image/jpeg").read
    md5sum = file_checksum(file)
    post_and_send(action, body, content_type, md5sum, params)
  end

  def post_and_send(action, body, content_type, md5sum, params)
    @request.env["CONTENT_TYPE"] = content_type
    @request.env["CONTENT_MD5"] = md5sum
    post(action, params: params, body: body)
  end

Better: test/models/api2_test.rb#test_posting_image_via_url

  def test_posting_image_via_url
    setup_image_dirs
    url = "https://mushroomobserver.org/images/thumb/459340.jpg"
    stub_request(:any, url).
      to_return(Rails.root.join("test/images/test_image.curl").read)
    params = {
      method: :post,
      action: :image,
      api_key: @api_key.key,
      upload_url: url
    }
    File.stub(:rename, false) do
      api = API2.execute(params)
      assert_no_errors(api, "Errors while posting image")
      img = Image.last
      assert_obj_arrays_equal([img], api.results)
      actual = File.read(img.local_file_name(:full_size))
      expect = Rails.root.join("test/images/test_image.jpg").read
      assert_equal(expect, actual, "Uploaded image differs from original!")
    end
  end

@JoeCohen
Member

JoeCohen commented May 4, 2024

The following is returned by the API help request http://localhost:3000/api2/images?help=1

{
  "version": 2,
  "run_date": "2024-05-04T11:40:53.075Z",
  "errors": [
    {
      "code": "API2::HelpMessage",
      "details": "Usage: confidence: confidence range (limit=-3..3); content_type: enum list (limit=bmp|gif|jpg|png|raw|tiff); copyright_holder_has: string (search within copyright holder); created_at: time range; date: date range (when photo taken); has_notes: boolean; has_observation: boolean (limit=true, is attached to an observation?); has_votes: boolean; id: integer list; include_subtaxa: boolean; include_synonyms: boolean; license: license; location: location list; name: name list; notes_has: string (search within notes); observation: observation list; ok_for_export: boolean; project: project list; quality: quality range (limit=1..4); size: enum (limit=huge|large|medium|small|thumbnail, width or height at least 160 for thumbnail, 320 for small, 640 for medium, 960 for large, 1280 for huge); species_list: species_list list; updated_at: time range; user: user list (who uploaded the photo)",
      "fatal": "true"
    }
  ],
  "run_time": 0.017291
}
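
The "details" string is easier to read as a table of parameters. A small sketch for splitting it, assuming the format seen above ("Usage: name: description; name: description; ..."); `api2_help_params` is a hypothetical helper, not part of MO.

```ruby
require "json"

# Turn an API2 help response into a { parameter name => description } hash.
# Assumption: entries are separated by "; " and each entry is "name: description",
# as in the response shown above.
def api2_help_params(json_body)
  response = JSON.parse(json_body)
  help = response.fetch("errors", []).find do |error|
    error["code"] == "API2::HelpMessage"
  end
  return {} unless help

  help["details"].delete_prefix("Usage: ").split("; ").to_h do |entry|
    entry.split(": ", 2)
  end
end
```

For the response above this would give entries such as "has_notes" mapped to "boolean".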

@JoeCohen JoeCohen linked a pull request May 19, 2024 that will close this issue
@JoeCohen
Member

JoeCohen commented Jun 1, 2024

Is iNat Obs in Fungi?

Taxa page for Fungi: https://www.inaturalist.org/taxa/47170-Fungi_
So [:taxon][:ancestor_ids] must include 47170
alternative: [:taxon][:iconic_taxon_name] == "Fungi"
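
Either check could be sketched roughly as follows, assuming the observation has already been parsed into a hash with symbol keys. `inat_fungi?` is a hypothetical name; trying the ancestor check first and falling back to the iconic taxon name is just one possible choice.

```ruby
# iNat taxon id for Fungi, from the taxa page URL above.
FUNGI_TAXON_ID = 47170

# True if the parsed iNat observation looks like it is in Fungi.
# Tries [:taxon][:ancestor_ids] first, then [:taxon][:iconic_taxon_name].
def inat_fungi?(obs)
  taxon = obs[:taxon] || {}
  Array(taxon[:ancestor_ids]).include?(FUNGI_TAXON_ID) ||
    taxon[:iconic_taxon_name] == "Fungi"
end
```

Neither check covers slime molds, which iNat files under the "Protozoa" iconic taxon mentioned in the task list.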

@JoeCohen
Member

JoeCohen commented Jun 13, 2024

iNat Projects Issue.

I cannot reliably get an iNat observation's Projects via the new API.
Ex: iNat 216745568 shows many projects via the UI.
But the corresponding fixture has nothing. See test/fixtures/inat/gyromitra_ancilis.txt

The old API might work. https://www.inaturalist.org/pages/api+reference#get-observations
Also see this Copilot response (about the UI):

As of now, iNaturalist provides a way to see which collection or umbrella projects include an observation. However, this feature is not available for traditional projects. If you're interested in finding out which projects an observation belongs to, here's how you can do it:

  1. Collection and Umbrella Projects:
    • Starting from the observation details page, you can now see the collection and umbrella projects that include an observation. This feature was added based on user requests. Simply navigate to the observation detail page, and you'll find the relevant information there.
  2. Traditional Projects:
    • Unfortunately, for traditional projects, there isn't a direct way to see which projects an observation qualifies for. The information about whether an observation belongs to a traditional project is not readily available in the observation data or CSV downloads.
    • If you're interested in this feature, consider raising it as a feature request on the iNaturalist platform. It would be valuable for users to know which projects their observations could potentially be part of.

Remember that iNaturalist is continually evolving, so keep an eye out for any updates or new features that might address this need! 🌿🔍

Source: Conversation with Copilot, 6/13/2024
(1) Updates to collection and umbrella projects · iNaturalist. https://www.inaturalist.org/blog/18375-updates-to-collection-and-umbrella-projects.
(2) Find out to which projects an observation has been added. https://forum.inaturalist.org/t/find-out-to-which-projects-an-observation-has-been-added/3236.
(3) Adding Observations to a Traditional Project - iNaturalist Community Forum. https://forum.inaturalist.org/t/adding-observations-to-a-traditional-project-wiki/13190.

@JoeCohen
Member

For contains_box, I need a (separate?) scope that generalizes this (from #2183):

  scope :contains, # Use named parameters (lat:, lng:), any order
        lambda { |**args|
          args => {lat:, lng:}
          where(
            Location[:south].lteq(lat).and(Location[:north].gteq(lat)).
            and(
              Location[:west].lteq(lng).and(Location[:east].gteq(lng)).or(
                Location[:west].gteq(lng).and(Location[:east].lteq(lng))
              )
            )
          )
        }

@nimmolo
Contributor

nimmolo commented Jun 28, 2024

How about something like this, that takes advantage of named arg assignment:

  scope :contains, # Use named parameters (lat:, lng:, or north:, south:, east:, west:), any order
        lambda { |**args|
          # args is a Hash, so use key lookup; Ruby spells "else if" as elsif
          if args[:lat].present? && args[:lng].present?
            args => {lat:, lng:}
            north = south = lat
            east = west = lng
          elsif args[:north].present?
            args => {north:, south:, east:, west:}
          end
          where(
            Location[:south].lteq(south).and(Location[:north].gteq(north)).
            and(
              Location[:west].lteq(west).and(Location[:east].gteq(east)).or(
                Location[:west].gteq(west).and(Location[:east].lteq(east))
              )
            )
          )
        }

Not sure about all that, but something like that.
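
For writing the e/w tests, a plain-Ruby reference model of the intended containment rules might be handy. This is only a sketch: `Box` is a hypothetical struct, not MO's Location model. It assumes that a box whose west edge is greater than its east edge straddles the 180th meridian, and it ignores boxes that wrap the whole globe. It may or may not agree with the Arel conditions quoted above, which is the point of comparing them.

```ruby
# Hypothetical plain-Ruby model of a bounding box; not MO's Location model.
Box = Struct.new(:north, :south, :east, :west, keyword_init: true) do
  # Assumption: west > east means the box crosses the 180th meridian.
  def straddles_dateline?
    west > east
  end

  def contains_lng?(lng)
    if straddles_dateline?
      lng >= west || lng <= east
    else
      west <= lng && lng <= east
    end
  end

  def contains_point?(lat:, lng:)
    south <= lat && lat <= north && contains_lng?(lng)
  end

  def contains_box?(other)
    return false unless south <= other.south && north >= other.north

    if straddles_dateline? == other.straddles_dateline?
      # Same kind of box on both sides: compare edges directly.
      west <= other.west && east >= other.east
    elsif straddles_dateline?
      # other does not straddle, so it must sit wholly on one side of the line.
      other.west >= west || other.east <= east
    else
      # A box that does not straddle cannot contain one that does.
      false
    end
  end
end
```

Test cases for the scope could then be checked against the model, for example a location spanning 170 to -170 containing a box spanning 172 to 178.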

@JoeCohen
Member

@nimmolo Yes. That looks right. Needs some tests, especially for e/w.
What's the best procedure for adding this:

@nimmolo
Contributor

nimmolo commented Jun 28, 2024

I'd say do it as a standalone PR vs main. We have plenty of scopes as yet unused, and one more won't hurt — plus of course I'll use the lat/lng part in my next PR.

@JoeCohen
Member

JoeCohen commented Jun 28, 2024 via email

This was referenced Jul 8, 2024
@MushroomObserver MushroomObserver deleted a comment from mo-nathan Jul 8, 2024
@JoeCohen
Member

iNat API Recommended Practices
