Add musuems_victoria_fetch.py #216

Closed
sbarhin wants to merge 3 commits into creativecommons:main from sbarhin:whiz

Conversation

@sbarhin sbarhin commented Oct 29, 2025

This PR implements automation for Museum Victoria data fetching as discussed in issue #215. The implementation follows the established patterns from existing fetch scripts.
The purpose of this file is to fetch all records from the Museum Victoria API and save the response fields needed for the next phase (processing).

Fixes

Description

  • Fetches data for all record types (article, item, specimen, species) from the Museum Victoria API
  • Prepares and saves the relevant response fields to a CSV file under the data/2025Q4/1-fetch directory
  • Next actions will be to process and report the data once the fetching script is approved by reviewers
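
A minimal sketch of the fetch loop described above, assuming a hypothetical `fetch_page(record_type, page)` callable that returns a list of records and an empty list once the pages run out. The actual Museum Victoria API request, its parameters, and its pagination signal are not shown in this PR excerpt, so they are left behind that callable; `iter_records`, `fake_fetch_page`, and the `record_type` key are illustrative names only.

```python
from typing import Callable, Dict, Iterator, List

# Record types listed in the PR description.
RECORD_TYPES = ("article", "item", "specimen", "species")


def iter_records(
    fetch_page: Callable[[str, int], List[Dict]],
    record_types=RECORD_TYPES,
    max_pages: int = 1000,
) -> Iterator[Dict]:
    """Yield every record for every record type, one page at a time.

    fetch_page is a hypothetical callable standing in for the real HTTP
    request; it must return an empty list when a page has no records.
    """
    for record_type in record_types:
        for page in range(1, max_pages + 1):
            records = fetch_page(record_type, page)
            if not records:
                break  # no more pages for this record type
            for record in records:
                # Tag each record with the type it was fetched under.
                yield {"record_type": record_type, **record}


def fake_fetch_page(record_type: str, page: int) -> List[Dict]:
    """In-memory stand-in for the API, used only to exercise the loop."""
    data = {
        "article": [[{"id": "a1"}, {"id": "a2"}], [{"id": "a3"}]],
        "item": [[{"id": "i1"}]],
    }
    pages = data.get(record_type, [])
    return pages[page - 1] if page <= len(pages) else []
```

Keeping the HTTP call behind a callable means the paging logic can be checked without network access, for example with `list(iter_records(fake_fetch_page))`.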

Checklist

  • I have read and understood the Developer Certificate of Origin (DCO), below, which covers the contents of this pull request (PR).
  • My pull request doesn't include code or content generated with AI.
  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main or master).
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

For the purposes of this DCO, "license" is equivalent to "license or public domain dedication," and "open source license" is equivalent to "open content license or public domain dedication."

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@sbarhin sbarhin requested review from a team as code owners October 29, 2025 09:03
@sbarhin sbarhin requested review from TimidRobot and possumbilities and removed request for a team October 29, 2025 09:03
@cc-open-source-bot cc-open-source-bot moved this to In review in TimidRobot Oct 29, 2025
@TimidRobot TimidRobot self-assigned this Oct 30, 2025
Member

Please remove these changes from the pull request (PR). Automation will be implemented after this PR is merged.

Member

@sbarhin it is unhelpful to mark a conversation resolved when the pull request (PR) hasn't been updated with a resolution. Please don't.

@TimidRobot
Member

  • I added or updated tests for the changes I made (if applicable).

I don't see any added or updated tests. Why did you check this item?

  • I added or updated documentation (if applicable).

I don't see any added or updated documentation. Why did you check this item?

Comment on lines +163 to +165
except requests.exceptions.RequestException as e:
    LOGGER.error(f"Error fetching page {current_page} for {record_type}: {e}")
    break

Member

Please model exception handling on existing scripts:

except requests.HTTPError as e:
    raise shared.QuantifyingException(f"HTTP Error: {e}", 1)
except requests.RequestException as e:
    raise shared.QuantifyingException(f"Request Exception: {e}", 1)
except KeyError as e:
    raise shared.QuantifyingException(f"KeyError: {e}", 1)
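
A minimal sketch of the pattern requested above: instead of logging an error and breaking out of the loop, a low-level exception is converted into one project exception that carries an exit code. `QuantifyingException` below is a hypothetical stand-in for `shared.QuantifyingException`, whose real definition is not shown here; the `records` key and the `extract_records` and `run` helpers are likewise illustrative. Only the `KeyError` branch is exercised so the sketch runs without network access; the `requests.HTTPError` and `requests.RequestException` branches would follow the same shape.

```python
class QuantifyingException(Exception):
    """Hypothetical stand-in for shared.QuantifyingException."""

    def __init__(self, message, exit_code=1):
        super().__init__(message)
        self.message = message
        self.exit_code = exit_code


def extract_records(payload):
    """Return the records from a decoded response body.

    The "records" key is an assumption made for illustration.
    """
    try:
        return payload["records"]
    except KeyError as e:
        # Re-raise as the project exception so the caller sees a
        # failure instead of a silently truncated data set.
        raise QuantifyingException(f"KeyError: {e}", 1) from e


def run(step):
    """Call step() and translate a QuantifyingException into an exit code."""
    try:
        step()
    except QuantifyingException as e:
        return e.exit_code
    return 0
```

With `break`, a failed page leaves a partial data set that looks like a successful run; raising makes the failure visible to whatever invoked the script.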

Member

@sbarhin it is unhelpful to mark a conversation resolved when the pull request (PR) hasn't been updated with a resolution. Please don't.

Author

I apologize. I will take note next time.

Member

Please be sure to run static analysis before committing.

Author

In pre-commit-config.yaml, may I update the Python version to 3.13?


def initialize_data_file(file_path, header):
    if not os.path.isfile(file_path):
        with open(file_path, "w", newline="") as file_obj:

'MEDIA JSON': sanitize_string(media_json_string),
}
initialize_data_file(FILE_RECORDS, HEADER_RECORDS)
with open(FILE_RECORDS, "a", newline="") as file:
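
The two excerpts above show only the first lines of the CSV handling. A minimal sketch of how the pieces might fit together, assuming `csv.DictWriter` with the standard library's `unix` dialect; the body of `initialize_data_file` and the `append_rows` helper are assumptions for illustration, not the code under review.

```python
import csv
import os


def initialize_data_file(file_path, header):
    """Create the CSV file with a header row if it does not exist yet."""
    if not os.path.isfile(file_path):
        with open(file_path, "w", newline="", encoding="utf-8") as file_obj:
            writer = csv.DictWriter(
                file_obj, fieldnames=header, dialect="unix"
            )
            writer.writeheader()


def append_rows(file_path, header, rows):
    """Append rows (dicts keyed by header) without rewriting the header."""
    initialize_data_file(file_path, header)
    with open(file_path, "a", newline="", encoding="utf-8") as file_obj:
        writer = csv.DictWriter(file_obj, fieldnames=header, dialect="unix")
        writer.writerows(rows)
```

Because the header is written only when the file is first created, repeated calls to `append_rows` on the same path add data rows without duplicating the header.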

Member

Please rename to match convention.

The script should be executable:
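
A minimal sketch, assuming a POSIX file system, of what "executable" means in practice: the file's mode includes the execute bits (the effect of `chmod +x` in a shell). The `make_executable` and `is_executable` helpers and any path passed to them are hypothetical and not part of the repository.

```python
import os
import stat


def make_executable(path):
    """Add the execute bits for user, group, and others to path."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def is_executable(path):
    """Return True if the owner's execute bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```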

Author

@TimidRobot Hi there, please can you explain further on "renaming to match convention"?

Member

Look at how the fetch scripts are named in the main branch. Match that convention.

Author

Can I name it museums_fetch.py? I guess adding "victoria" made the name rather long.

@TimidRobot
Member

@sbarhin Please pay better attention to the documentation, instructions, and the conventions set by the existing scripts.

"--enable-save",
action="store_true",
help="Enable saving results",
default="true"

Member

As demonstrated in the existing scripts, this shouldn't default to true.

The scripts should run as safely as possible without any command line options.

Member

@sbarhin it is unhelpful to mark a conversation resolved when the pull request (PR) hasn't been updated with a resolution. Please don't.

"--enable-git",
action="store_true",
help="Enable git actions (fetch, merge, add, commit, and push)",
default="true"

Member

As demonstrated in the existing scripts, this shouldn't default to true.

The scripts should run as safely as possible without any command line options.
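
A minimal sketch of the safe default described above. With `action="store_true"`, argparse already sets the option to `False` when the flag is omitted; adding `default="true"` replaces that with a non-empty string, which is truthy, so the option is effectively always on. The `parse_arguments` function name and the `argv` parameter are assumptions for illustration.

```python
import argparse


def parse_arguments(argv=None):
    """Parse command-line options; both flags are off unless passed."""
    parser = argparse.ArgumentParser(description="Fetch script options")
    parser.add_argument(
        "--enable-save",
        action="store_true",  # False unless the flag is given
        help="Enable saving results",
    )
    parser.add_argument(
        "--enable-git",
        action="store_true",  # False unless the flag is given
        help="Enable git actions (fetch, merge, add, commit, and push)",
    )
    return parser.parse_args(argv)
```

Run with no options, a script built this way neither writes files nor touches git; each side effect has to be requested explicitly, for example `parse_arguments(["--enable-save"])`.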

Member

@sbarhin it is unhelpful to mark a conversation resolved when the pull request (PR) hasn't been updated with a resolution. Please don't.

@TimidRobot
Member

@sbarhin good work implementing their API

@TimidRobot TimidRobot changed the title from "Added musuems_victoria_fetch.py" to "Add musuems_victoria_fetch.py" Oct 30, 2025
@sbarhin sbarhin closed this Oct 30, 2025
@github-project-automation github-project-automation bot moved this from In review to Done in TimidRobot Oct 30, 2025
Author

sbarhin commented Oct 30, 2025

@TimidRobot Please it seems this PR is closed. What could be the issue?

Contributor

oree-xx commented Oct 30, 2025

@sbarhin It says you closed it?

Author

sbarhin commented Oct 30, 2025

> @sbarhin It says you closed it?

Oops. Could I have done that by mistake? I was pushing changes in response to the reviews, and later found that it says I closed it.

Author

sbarhin commented Oct 30, 2025

@oree-xx @TimidRobot Please, what's the way forward now?

Contributor

oree-xx commented Oct 30, 2025

@sbarhin You would need @TimidRobot to look into it.
I was looking at this tho https://stackoverflow.com/questions/53735216/reopen-a-pull-request-on-github
But I think the maintainer can resolve it. Just wait a little.

Author

sbarhin commented Oct 30, 2025

> @sbarhin You would need @TimidRobot to look into it. I was looking at this tho https://stackoverflow.com/questions/53735216/reopen-a-pull-request-on-github But I think the maintainer can resolve it. Just wait a little.

Okay, thank you.

Author

sbarhin commented Oct 30, 2025

I had issues changing the git credentials. I will create a new PR
