# ADS Parser Testing Notebook

This notebook tests the ADS parser functionality including:
- Connection testing
- Paper information retrieval with abstracts
- Dedicated abstract retrieval function
- Testing with bibcodes from WUMaCat.csv

## Features Available:
- `test_ads_connection()` - Tests API connectivity
- `get_paper_info(bibcode, show_abstract=True)` - Full paper information
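- `get_abstract(bibcode)` - Dedicated abstract retrieval
- `get_bulk_paper_info()` - Batched retrieval for a list of bibcodes
- `download_catalogue_abstracts()` - Downloads titles and abstracts for the catalogue to a JSON file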


In [3]:
# Setup and imports
import sys
import os

# Add the src directory to the path
sys.path.append('../src')

# Import our ADS parser functions
from ads_parser import test_ads_connection, get_paper_info, get_abstract

print("✅ Libraries imported successfully")
print("✅ ADS parser functions available")
print("✅ Ready for testing!")


✅ Libraries imported successfully
✅ ADS parser functions available
✅ Ready for testing!


In [4]:
# Test 1: ADS API Connection
print("🔍 Testing ADS API connection...")
print("="*40)

connection_success = test_ads_connection()

if connection_success:
    print("\n✅ Connection test passed! Ready to retrieve abstracts.")
else:
    print("\n❌ Connection test failed! Please check your ADS API token.")


🔍 Testing ADS API connection...
🔍 Testing ADS API connection...
✅ ADS API connection successful!
   Found 1103 total results
   Retrieved 1 documents

✅ Connection test passed! Ready to retrieve abstracts.
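
If the connection test fails, the message above points to the API token. A minimal sketch of a pre-flight token check follows. It assumes the token lives in an environment variable named `ADS_DEV_KEY` (the convention of the `ads` Python package); whether `ads_parser` reads the same variable is an assumption, and `find_ads_token` and `auth_header` are hypothetical helper names, not part of `ads_parser`.

```python
import os


def find_ads_token(env=None, var_name="ADS_DEV_KEY"):
    """Return the stripped token from the given mapping, or None if unset/empty.

    var_name is an assumption; adjust it to whatever ads_parser actually reads.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    return token or None


def auth_header(token):
    """Build the Bearer authorization header used by the ADS API."""
    return {"Authorization": f"Bearer {token}"}


token = find_ads_token()
if token is None:
    print("No token found in ADS_DEV_KEY; the connection test would likely fail.")
else:
    # Report only the length so the secret never appears in notebook output.
    print(f"Token found ({len(token)} characters).")
```

Running this before `test_ads_connection()` separates a missing token from a network or service problem.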


In [5]:
# Test 2: Abstract Retrieval with get_abstract()
test_bibcode = "2020AJ....159..189L"  # From WUMaCat.csv

print("📄 Testing dedicated abstract retrieval function")
print("="*60)
print(f"Bibcode: {test_bibcode}")
print()

abstract = get_abstract(test_bibcode)

if abstract:
    print("\n🎉 Abstract successfully retrieved!")
    print(f"Length: {len(abstract)} characters")
else:
    print("\n❌ Failed to retrieve abstract")


📄 Testing dedicated abstract retrieval function
Bibcode: 2020AJ....159..189L

📄 Retrieving abstract for bibcode: 2020AJ....159..189L
✅ Abstract retrieved successfully
   Title: The First Light Curve Modeling and Orbital Period Change Investigation of Nine Contact Binaries around the Short-period Cutoff
   Abstract length: 1402 characters

📄 Abstract:
------------------------------------------------------------
In this paper, we present the first light curve synthesis and orbital period change analysis of nine contact binaries around the short-period limit. It is found that all these systems are W-subtype contact binaries. One of them is a medium contact system while the others are shallow contact ones. Four of them manifest obvious O'Connell effect explained by a dark spot or hot spot on one of the component stars. Third light was detected in three systems. By investigating orbital period variations, we found that four of the targets display a secular period decrease while t

In [6]:
# Test 3: Paper Info with Abstract Control
print("📄 Testing get_paper_info() with abstract options")
print("="*60)

# Test with abstract display enabled (default)
print("1. With abstract display enabled:")
paper_info = get_paper_info(test_bibcode, show_abstract=True)

print("\n" + "="*60)

# Test with abstract display disabled
print("2. With abstract display disabled:")
paper_info_no_abstract = get_paper_info(test_bibcode, show_abstract=False)

if paper_info and 'abstract' in paper_info:
    print("\n📊 Abstract data available in returned dictionary:")
    print(f"   Abstract length: {len(paper_info['abstract'])} characters")
else:
    print("\n❌ No abstract data in returned dictionary")


📄 Testing get_paper_info() with abstract options
1. With abstract display enabled:
🔍 Retrieving paper information for bibcode: 2020AJ....159..189L
✅ Successfully retrieved paper information
   Title: The First Light Curve Modeling and Orbital Period Change Investigation of Nine Contact Binaries around the Short-period Cutoff
   Authors: Li, Kai, Kim, Chun-Hwey, Xia, Qi-Qi, Michel, Raul, Hu, Shao-Ming, Gao, Xing, Guo, Di-Fu, Chen, Xu
   Year: 2020
   Journal: The Astronomical Journal
   Abstract: In this paper, we present the first light curve synthesis and orbital period change analysis of nine contact binaries around the short-period limit. It is found that all these systems are W-subtype contact binaries. One of them is a medium contact system while the others are shallow contact ones. Four of them manifest obvious O'Connell effect explained by a dark spot or hot spot on one of the component stars. Third light was detected in three systems. By investigating orbital period var

In [4]:
# Test 5: Bulk Bibcode Requests (NEW FEATURE)
# Import the new bulk function
from ads_parser import get_bulk_paper_info

# Test with multiple bibcodes from WUMaCat.csv
bulk_test_bibcodes = [
    "2020AJ....159..189L",  # 1SWASP J003033.05+574347.6
    "2018NewA...59....8S",  # 1SWASP J011732.10+525204.9  
    "2015AJ....150..117Q",  # 1SWASP J015100.23-100524.2
    "2015NewA...41...22J",  # Test another one
    "2018Ap&SS.363...15L"   # And one more
]

print("📚 Testing BULK bibcode retrieval")
print("="*60)
print(f"Testing with {len(bulk_test_bibcodes)} bibcodes")
print("Recommended batch size: 50-100 bibcodes per request")
print()

# Test bulk retrieval with small batch size for demo
bulk_results = get_bulk_paper_info(bulk_test_bibcodes, show_abstracts=False, batch_size=3)

if bulk_results:
    print("\n📊 Bulk Results Summary:")
    print(f"   Successfully retrieved: {len(bulk_results)} papers")
    print(f"   Available bibcodes: {list(bulk_results.keys())}")
else:
    print("\n❌ Bulk retrieval failed")


📚 Testing BULK bibcode retrieval
Testing with 5 bibcodes
Recommended batch size: 50-100 bibcodes per request

🔍 Retrieving information for 5 papers in batches of 3

📦 Processing batch 1/2 (3 bibcodes)
   Rate limit remaining: 4986
   ✅ Retrieved 3 papers from batch

📦 Processing batch 2/2 (2 bibcodes)
   Rate limit remaining: 4985
   ✅ Retrieved 2 papers from batch

🎉 Bulk retrieval completed! Retrieved 5/5 papers

📊 Bulk Results Summary:
   Successfully retrieved: 5 papers
   Available bibcodes: ['2020AJ....159..189L', '2015AJ....150..117Q', '2018NewA...59....8S', '2018Ap&SS.363...15L', '2015NewA...41...22J']
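
The batch log above (5 bibcodes, batch size 3, giving batches of 3 and 2) is consistent with simple fixed-size chunking. The sketch below illustrates that splitting only. It is not the actual `get_bulk_paper_info` implementation, and `chunk_bibcodes` is a hypothetical helper name.

```python
from typing import Iterator, List, Sequence


def chunk_bibcodes(bibcodes: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size bibcodes."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(bibcodes), batch_size):
        yield list(bibcodes[start:start + batch_size])


# The same five bibcodes used in the bulk test cell.
bibcodes = [
    "2020AJ....159..189L",
    "2018NewA...59....8S",
    "2015AJ....150..117Q",
    "2015NewA...41...22J",
    "2018Ap&SS.363...15L",
]

batches = list(chunk_bibcodes(bibcodes, batch_size=3))
for index, batch in enumerate(batches, start=1):
    print(f"batch {index}/{len(batches)}: {len(batch)} bibcodes")
```

Note that the returned dictionary in the output lists bibcodes in a different order than the input list, so callers should look results up by bibcode rather than rely on position.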


In [None]:
# Test 6: Download All Catalogue Abstracts to JSON
from ads_parser import download_catalogue_abstracts

# Test with a small sample first (first 10 papers)
print("🧪 TESTING: Download abstracts for first 10 papers from WUMaCat.csv")
print("="*70)

# This will download abstracts and save to JSON
# Format: {"papers": {"bibcode": {"title": "...", "abstract": "..."}}}

# Uncomment the line below to run the full catalogue download:
# results = download_catalogue_abstracts("../data/WUMaCat.csv", "wumacat_abstracts.json", batch_size=50)

print("💡 To run the full download, uncomment the line above")
print("💡 This will download ~689 papers in batches of 50")
print("💡 Estimated time: ~20-30 minutes (with 2-second delays)")
print("💡 Output: JSON file with title and abstract for each bibcode")
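
The JSON layout described in the cell's comments, `{"papers": {"bibcode": {"title": "...", "abstract": "..."}}}`, can be illustrated with a short sketch. `build_catalogue_json` and `count_batches` are hypothetical helpers and do not reflect how `download_catalogue_abstracts` is actually written. The sample abstract is only the first sentence shown earlier in this notebook.

```python
import json
import math


def build_catalogue_json(papers):
    """Keep only title and abstract per bibcode, nested under a 'papers' key."""
    return {
        "papers": {
            bibcode: {
                "title": info.get("title", ""),
                "abstract": info.get("abstract", ""),
            }
            for bibcode, info in papers.items()
        }
    }


def count_batches(n_papers, batch_size):
    """Number of requests needed to cover n_papers at the given batch size."""
    return math.ceil(n_papers / batch_size)


sample = {
    "2020AJ....159..189L": {
        "title": (
            "The First Light Curve Modeling and Orbital Period Change "
            "Investigation of Nine Contact Binaries around the Short-period Cutoff"
        ),
        "abstract": (
            "In this paper, we present the first light curve synthesis and "
            "orbital period change analysis of nine contact binaries around "
            "the short-period limit."
        ),
        "year": "2020",  # extra fields are dropped from the output
    }
}

print(json.dumps(build_catalogue_json(sample), indent=2, ensure_ascii=False))
print(f"Batches needed for 689 papers at batch size 50: {count_batches(689, 50)}")
```

At a batch size of 50, roughly 689 papers correspond to 14 batches.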
