@CHERRY-ui8
Contributor

  • Add NCBISearch searcher for DNA/GenBank/Entrez database search
  • Add RNACentralSearch searcher for RNA database search
  • Update search_all.py to support ncbi and rnacentral data sources
  • Add search configs for DNA, RNA, and protein (renamed from search_config)
  • Add search scripts for DNA and RNA
  • Add demo input files for DNA and RNA search
  • Update search_uniprot.sh to use search_protein_config.yaml

This PR extends the search functionality to support biological data types:

  • Protein search (existing, now explicitly named)
  • DNA search via NCBI
  • RNA search via RNAcentral
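For illustration only, the data-source dispatch described above might look roughly like the sketch below. The registry name `SEARCHERS`, the stub functions, and the `search_all` signature are hypothetical and are not taken from this PR's `search_all.py`; the stubs stand in for the real UniProt, `NCBISearch`, and `RNACentralSearch` searchers.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins for the real searchers; each maps a query to a hit dict.
def _stub_uniprot(query: str) -> Optional[dict]:
    return {"source": "uniprot", "query": query}

def _stub_ncbi(query: str) -> Optional[dict]:
    return {"source": "ncbi", "query": query}

def _stub_rnacentral(query: str) -> Optional[dict]:
    return {"source": "rnacentral", "query": query}

# Assumed registry: data-source name -> search callable.
SEARCHERS: Dict[str, Callable[[str], Optional[dict]]] = {
    "uniprot": _stub_uniprot,
    "ncbi": _stub_ncbi,
    "rnacentral": _stub_rnacentral,
}

def search_all(queries: List[str], data_sources: List[str]) -> Dict[str, List[Optional[dict]]]:
    """Run every query against each requested data source."""
    unknown = [name for name in data_sources if name not in SEARCHERS]
    if unknown:
        raise ValueError(f"Unsupported data sources: {unknown}")
    return {name: [SEARCHERS[name](q) for q in queries] for name in data_sources}
```

With a registry like this, supporting another data source means adding one entry rather than another branch.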

- Remove all trailing whitespace
- Fix line-too-long issues (split long lines)
- Remove trailing newlines at end of files
- Remove unnecessary else/elif after return statements
- Extract utility functions (_gene_record_to_dict, _accession_to_dict, _rna_data_to_dict)
- Unify method naming: search_by_keyword -> get_best_hit
- Add threshold parameter to NCBI and RNAcentral searchers for interface consistency
- Improve error handling with network error detection and fallback strategies
- Fix RNAcentral sequence search to prioritize exact matches
- Add search_rna_demo.jsonl example file
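As a rough sketch of the "prioritize exact matches" and `threshold` items above: the helper below returns an exact sequence match when one exists and otherwise falls back to the highest-scoring hit at or above the threshold. The function name `pick_best_hit` and the `sequence`/`score` keys are assumptions made for this example, not the actual fields used by the searchers in this PR.

```python
from typing import List, Optional

def pick_best_hit(query_seq: str, hits: List[dict], threshold: float = 0.0) -> Optional[dict]:
    """Select one hit from candidates shaped like {'sequence': str, 'score': float}."""
    normalized = query_seq.strip().upper()

    # An exact (case-insensitive) sequence match wins regardless of score.
    for hit in hits:
        if hit["sequence"].upper() == normalized:
            return hit

    # Otherwise keep only hits meeting the threshold and take the best one.
    eligible = [hit for hit in hits if hit["score"] >= threshold]
    return max(eligible, key=lambda hit: hit["score"]) if eligible else None
```

Accepting the same `threshold` argument in every searcher lets a caller treat them interchangeably, even where a given backend only uses it as a fallback filter.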
@CHERRY-ui8 CHERRY-ui8 force-pushed the feat/add-dna-rna-search branch from f1bca32 to 8678e33 Compare November 27, 2025 16:55
@CHERRY-ui8 CHERRY-ui8 force-pushed the feat/add-dna-rna-search branch from 6016dfe to 40ef49e Compare November 27, 2025 17:08
CHERRY-ui8 and others added 5 commits November 29, 2025 22:25
- Refactor search_by_sequence in ncbi_searcher.py to reduce return statements from 7 to 1
- Refactor search_by_sequence in rnacentral_searcher.py to reduce return statements from 8 to 1 and branches from 16 to 12
- Extract helper methods to improve code readability and maintainability
- Fix pylint errors R0911 (too-many-return-statements) and R0912 (too-many-branches)
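The single-return pattern behind the R0911 fix can be sketched as follows. This is a simplified illustration, not the refactored code from `ncbi_searcher.py` or `rnacentral_searcher.py`: the helpers `_is_valid_sequence` and `_lookup` and the dictionary-backed `database` argument are hypothetical.

```python
from typing import Optional

def _is_valid_sequence(seq: str) -> bool:
    """Accept only non-empty strings made of nucleotide letters (assumed alphabet)."""
    return bool(seq) and set(seq.upper()) <= set("ACGTUN")

def _lookup(seq: str, database: dict) -> Optional[dict]:
    """Stand-in for the remote query; here it is a plain dictionary lookup."""
    return database.get(seq.upper())

def search_by_sequence(seq, database: dict) -> Optional[dict]:
    """Validate, then search, with one exit point instead of many early returns."""
    result: Optional[dict] = None
    cleaned = seq.strip() if isinstance(seq, str) else ""
    if _is_valid_sequence(cleaned):
        result = _lookup(cleaned, database)
    return result
```

Each early-return condition moves into a helper or a guard around the single `result` assignment, which is what reduces both the return-statement and the branch counts.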
@CHERRY-ui8 CHERRY-ui8 force-pushed the feat/add-dna-rna-search branch from bfa8a2f to d767096 Compare November 30, 2025 18:03
@ChenZiHong-Gavin ChenZiHong-Gavin merged commit ac22fdc into Intern-Science:main Dec 1, 2025
3 checks passed
@ChenZiHong-Gavin ChenZiHong-Gavin deleted the feat/add-dna-rna-search branch December 1, 2025 10:51
Labels

enhancement New feature or request

2 participants