# 01: Explore Goodreads Romance JSON Files

**Objective**: Systematically explore the romance-related Goodreads JSON files to understand their structure, quality, and content before analysis.

**Research Context**: Analyze how thematic characteristics of modern romance novels relate to reader engagement/popularity using Goodreads metadata.

**Target Subgenres**: contemporary, historical, paranormal, romantic suspense, romantic fantasy, science-fiction romance

**Engagement Metrics**: ratings, ratings_count, reviews_count, review text signals, shelf/status signals ("to-read", "currently-reading")

## Analysis Plan
1. **Pre-flight Manifest Scan** - Detect file presence, sizes, compression status
2. **Schema Inventory** - Field names, examples, dtypes, null percentages
3. **Key Coverage Analysis** - book_id, work_id, author_id mapping
4. **Subgenre Signal Discovery** - Popular shelves, genre tags, classification signals
5. **Engagement Metrics Feasibility** - Rating distributions, review counts, shelf signals
6. **Final Column Contract Draft** - Define target schema for analysis
7. **Quality Filters Dry-Run** - Test filtering criteria on samples
8. **Synthesis** - Summary and next steps
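
Step 1 of the plan (file presence, size, compression status) can be sketched as below. This is a minimal illustration, not the notebook's implementation: `describe_file` and the `"none_detected"` label are hypothetical names introduced here. The only external fact it relies on is that a gzip stream starts with the two bytes `0x1f 0x8b`, which lets the scan check compression from the file content instead of trusting the `.gz` extension.

```python
from pathlib import Path

# First two bytes of every gzip stream (gzip file format, RFC 1952)
GZIP_MAGIC = b"\x1f\x8b"


def describe_file(path: Path) -> dict:
    """Return a small manifest entry for one file (hypothetical helper)."""
    if not path.exists():
        return {"status": "missing"}

    # Read only the two-byte header; the file itself is never decompressed
    with open(path, "rb") as fh:
        header = fh.read(2)

    return {
        "status": "found",
        "size_bytes": path.stat().st_size,
        "compression": "gzip" if header == GZIP_MAGIC else "none_detected",
    }
```

Calling `describe_file(data_raw_dir / "goodreads_books_romance.json.gz")` would return one entry of the manifest; a file with a `.gz` name that is not actually compressed would be reported as `"none_detected"`.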

## Expected Outputs
- File manifest with sizes and record estimates
- Schema inventory per file with field types and quality metrics
- Key field coverage analysis
- Subgenre classification feasibility assessment
- Engagement metrics availability report
- Final column contract for analysis dataset

In [10]:
# PLAN: Pre-flight manifest scan - detect file presence, sizes, compression, record estimates
# FILES: Read from data/raw/, write to logs/exploration/ and data/intermediate/manifests/
# EXPECTED: File manifest with sizes, compression status, estimated records, missing files
# TIMESTAMP: 2024-01-XX HH:MM:SS

import os
import gzip
import json
from pathlib import Path
from datetime import datetime
import hashlib
import sys
from collections import defaultdict

# FIXED: Proper project root detection - go up from notebooks/ to project root
notebook_dir = Path().absolute()
project_root = notebook_dir.parent  # Go up one level from notebooks/ to project root

# Validate we're in the right place
if not (project_root / "data" / "raw").exists():
    # Try alternative detection if we're already in project root
    if (notebook_dir / "data" / "raw").exists():
        project_root = notebook_dir
    else:
        raise FileNotFoundError(f"Cannot find data/raw/ directory. Current: {notebook_dir}, Project root: {project_root}")

print(f"✅ Project root detected: {project_root}")
print(f"✅ Data raw directory: {project_root / 'data' / 'raw'}")
print(f"✅ Data raw exists: {(project_root / 'data' / 'raw').exists()}")

# Add project root to path for imports
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

# Create required directories
log_full_dir = project_root / "logs" / "exploration" / "full"
log_summary_dir = project_root / "logs" / "exploration" / "summary"
manifest_dir = project_root / "data" / "intermediate" / "manifests"

for dir_path in [log_full_dir, log_summary_dir, manifest_dir]:
    dir_path.mkdir(parents=True, exist_ok=True)

# Generate run ID
run_id = datetime.now().strftime("%Y%m%d_%H%M%S")
start_time = datetime.now()

# Expected files (based on actual data/raw contents)
expected_files = [
    "goodreads_books_romance.json.gz",
    "goodreads_interactions_romance.json.gz",
    "goodreads_reviews_romance.json.gz",
    "goodreads_book_authors.json.gz",
    "goodreads_book_works.json.gz",
    "goodreads_book_genres_initial.json.gz",
    "goodreads_reviews_dedup.json.gz",
    "goodreads_reviews_spoiler.json.gz",
    "goodreads_book_series.json.gz"
]

data_raw_dir = project_root / "data" / "raw"
manifest_data = {
    "run_id": run_id,
    "timestamp": start_time.isoformat(),
    "environment": {
        "python_version": sys.version,
        "platform": sys.platform,
        "project_root": str(project_root),
        "notebook_dir": str(notebook_dir),
        "data_raw_dir": str(data_raw_dir)
    },
    "files": {}
}

print(f"=== PRE-FLIGHT MANIFEST SCAN ===")
print(f"Run ID: {run_id}")
print(f"Start Time: {start_time}")
print(f"Project Root: {project_root}")
print(f"Data Raw Directory: {data_raw_dir}")
print(f"Python Version: {sys.version}")
print(f"Platform: {sys.platform}")
print(f"Scanning {len(expected_files)} expected files...")
print("=" * 60)

# Scan each expected file
for filename in expected_files:
    file_path = data_raw_dir / filename
    
    if file_path.exists():
        # Get file size
        file_size = file_path.stat().st_size
        file_size_mb = file_size / (1024 * 1024)
        
        # Estimate the record count: measure the share of valid JSON lines in the
        # first 1000 lines, then scale the file's total line count by that share
        record_count_estimate = None
        try:
            with gzip.open(file_path, 'rt', encoding='utf-8') as f:
                lines = []
                for i, line in enumerate(f):
                    if i >= 1000:  # Sample first 1000 lines
                        break
                    lines.append(line.strip())
                
                # Count valid JSON lines, skipping blanks and bare array brackets
                valid_lines = 0
                for line in lines:
                    if line and line not in ('[', ']'):
                        try:
                            json.loads(line.rstrip(','))
                            valid_lines += 1
                        except json.JSONDecodeError:
                            continue
                
                # Estimate total records
                if valid_lines > 0:
                    # Count every line with a second full pass over the file
                    # (this decompresses the whole file, so it is slow for large inputs)
                    with gzip.open(file_path, 'rt', encoding='utf-8') as f2:
                        total_lines = sum(1 for _ in f2)
                    
                    # Scale the total line count by the sampled valid-JSON ratio
                    valid_ratio = valid_lines / len(lines)
                    record_count_estimate = int(total_lines * valid_ratio)
        except Exception as e:
            print(f"ERROR: Could not estimate records for {filename}: {e}")
        
        # Store file info
        file_info = {
            "status": "found",
            "size_bytes": file_size,
            "size_mb": round(file_size_mb, 2),
            "compression": "gzip",
            "record_count_estimate": record_count_estimate,
            "sampling_method": "first_1000_lines"
        }
        
        print(f"✅ {filename}")
        print(f"   Size: {file_size_mb:.2f} MB")
        print(f"   Compression: gzip")
        if record_count_estimate:
            print(f"   Estimated Records: {record_count_estimate:,}")
        else:
            print(f"   Estimated Records: Could not determine")
        
    else:
        file_info = {
            "status": "missing",
            "error": "File not found"
        }
        print(f"❌ {filename} - FILE NOT FOUND")
    
    manifest_data["files"][filename] = file_info
    print()

# Calculate summary statistics
found_files = [f for f, info in manifest_data["files"].items() if info["status"] == "found"]
missing_files = [f for f, info in manifest_data["files"].items() if info["status"] == "missing"]
total_size_mb = sum(info["size_mb"] for info in manifest_data["files"].values() if info["status"] == "found")
# record_count_estimate is None when estimation failed, so treat None as 0
total_records = sum((info.get("record_count_estimate") or 0) for info in manifest_data["files"].values() if info["status"] == "found")

manifest_data["summary"] = {
    "files_found": len(found_files),
    "files_missing": len(missing_files),
    "total_size_mb": round(total_size_mb, 2),
    "total_records_estimate": total_records,
    "missing_files": missing_files
}

end_time = datetime.now()
duration = end_time - start_time
manifest_data["duration_seconds"] = duration.total_seconds()

print("=" * 60)
print(f"SCAN SUMMARY")
print(f"Files Found: {len(found_files)}/{len(expected_files)}")
print(f"Files Missing: {len(missing_files)}")
print(f"Total Size: {total_size_mb:.2f} MB")
print(f"Total Records (Estimate): {total_records:,}")
print(f"Duration: {duration.total_seconds():.2f} seconds")
print(f"End Time: {end_time}")

if missing_files:
    print(f"MISSING FILES (will continue with available files):")
    for file in missing_files:
        print(f"   - {file}")

# Save detailed log
log_full_path = log_full_dir / f"{run_id}_preflight_manifest.log"
with open(log_full_path, 'w', encoding='utf-8') as f:
    # End every write with a newline so the log has one entry per line
    f.write("=== PRE-FLIGHT MANIFEST SCAN LOG ===\n")
    f.write(f"Run ID: {run_id}\n")
    f.write(f"Start Time: {start_time}\n")
    f.write(f"End Time: {end_time}\n")
    f.write(f"Duration: {duration.total_seconds():.2f} seconds\n")
    f.write(f"Project Root: {project_root}\n")
    f.write(f"Notebook Dir: {notebook_dir}\n")
    f.write(f"Data Raw Dir: {data_raw_dir}\n")
    f.write(f"Python Version: {sys.version}\n")
    f.write(f"Platform: {sys.platform}\n")
    
    f.write("\nFILE DETAILS:\n")
    for filename, info in manifest_data["files"].items():
        f.write(f"{filename}:\n")
        for key, value in info.items():
            f.write(f"  {key}: {value}\n")
    
    f.write("\nSUMMARY:\n")
    for key, value in manifest_data["summary"].items():
        f.write(f"  {key}: {value}\n")

# Save summary log
log_summary_path = log_summary_dir / f"{run_id}_preflight_summary.log"
with open(log_summary_path, 'w', encoding='utf-8') as f:
    # End every write with a newline so the log has one entry per line
    f.write("Pre-flight Manifest Scan Summary\n")
    f.write(f"Run ID: {run_id}\n")
    f.write(f"Files Found: {len(found_files)}/{len(expected_files)}\n")
    f.write(f"Files Missing: {len(missing_files)}\n")
    f.write(f"Total Size: {total_size_mb:.2f} MB\n")
    f.write(f"Total Records (Estimate): {total_records:,}\n")
    f.write(f"Duration: {duration.total_seconds():.2f} seconds\n")
    if missing_files:
        f.write(f"Missing Files: {', '.join(missing_files)}\n")

# Save JSON manifest
manifest_path = manifest_dir / f"manifest_{run_id}.json"
with open(manifest_path, 'w') as f:
    json.dump(manifest_data, f, indent=2)

print(f"📁 OUTPUTS SAVED:")
print(f"   Full Log: {log_full_path}")
print(f"   Summary Log: {log_summary_path}")
print(f"   JSON Manifest: {manifest_path}")

# LOG: End time, metrics, outputs
print(f"=== STEP 1 COMPLETE ===")
print(f"End Time: {end_time}")
print(f"Files Processed: {len(expected_files)}")
print(f"Files Found: {len(found_files)}")
print(f"Files Missing: {len(missing_files)}")
print(f"Total Data Size: {total_size_mb:.2f} MB")
print(f"Estimated Total Records: {total_records:,}")
print(f"Outputs: 3 files saved")

✅ Project root detected: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research
✅ Data raw directory: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research/data/raw
✅ Data raw exists: True
=== PRE-FLIGHT MANIFEST SCAN ===
Run ID: 20250820_001037
Start Time: 2025-08-20 00:10:37.839626
Project Root: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research
Data Raw Directory: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research/data/raw
Python Version: 3.9.18 (main, Jan 29 2025, 16:13:49) 
[GCC 13.3.0]
Platform: linux
Scanning 9 expected files...
✅ goodreads_books_romance.json.gz
   Size: 347.88 MB
   Compression: gzip
   Estimated Records: 335,449

✅ goodreads_interactions_romance.json.gz
   Size: 2186.71 MB
   Compression: gzip
   Estimated Records: 42,792,856

✅ goodreads_reviews_romance.json.gz
   Size: 1240.85 MB
   Compression: gzip
   Estimated Records: 3

## 📚 Utility Functions

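The next cell defines utilities adapted from `src/utils/lightweight_handlers.py`: `NotebookJSONHandler`, which streams and samples gzipped line-delimited JSON files, and `NotebookSchemaInspector`, which summarizes field types, missing values, and value lengths from a sample of records.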
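
One caveat about sampling: taking the first N lines of a file is fast, but it can be unrepresentative if the file is ordered (for example by ID or date). A possible alternative, shown here only as a sketch, is reservoir sampling (Algorithm R), which draws a uniform random sample in one streaming pass without loading the file into memory. The function name `reservoir_sample_jsonl_gz` and its parameters are hypothetical and do not appear in the notebook's utilities; the trade-off is that it has to decompress the whole file.

```python
import gzip
import json
import random
from pathlib import Path
from typing import Any, Dict, List


def reservoir_sample_jsonl_gz(path: Path, k: int, seed: int = 0) -> List[Dict[str, Any]]:
    """Uniformly sample up to k records from a gzipped JSON Lines file."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample: List[Dict[str, Any]] = []
    seen = 0  # number of valid records read so far

    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines, as the notebook's handler does

            seen += 1
            if len(sample) < k:
                # Fill the reservoir with the first k valid records
                sample.append(record)
            else:
                # Keep the new record with probability k / seen
                j = rng.randrange(seen)
                if j < k:
                    sample[j] = record

    return sample
```

For the multi-gigabyte interactions file a full pass may take minutes, so the first-N approach used in this notebook remains the cheaper choice for a quick schema check.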

In [11]:
# ============================================================================
# UTILITY FUNCTIONS (Adapted from src/utils/lightweight_handlers.py)
# ============================================================================

import json
import gzip
from pathlib import Path
from typing import Dict, List, Any, Iterator, Optional, Union
import logging
from datetime import datetime
from collections import defaultdict, Counter

# Set up basic logging for notebook
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)


class NotebookJSONHandler:
    """Enhanced JSON file handler for notebook use."""
    
    def __init__(self, chunk_size: int = 1000):
        self.chunk_size = chunk_size
    
    def read_json_gz(self, file_path: Path, max_records: Optional[int] = None) -> Iterator[Dict[str, Any]]:
        """Read a gzipped JSON file line by line with progress tracking."""
        if not file_path.exists():
            print(f"❌ File not found: {file_path}")
            return
            
        print(f"📖 Reading JSON file: {file_path.name}")
        
        records_read = 0
        decode_errors = 0
        try:
            with gzip.open(file_path, 'rt', encoding='utf-8') as f:
                # Number lines from 1 so error messages match the file
                for line_num, line in enumerate(f, start=1):
                    if max_records and records_read >= max_records:
                        break
                        
                    try:
                        record = json.loads(line.strip())
                        yield record
                        records_read += 1
                        
                        # Progress indicator
                        if records_read % 1000 == 0:
                            print(f"   Processed {records_read:,} records...")
                        
                    except json.JSONDecodeError as e:
                        decode_errors += 1
                        if decode_errors <= 10:  # Only show the first few errors
                            print(f"⚠️  JSON decode error at line {line_num}: {e}")
                        continue
                        
        except Exception as e:
            print(f"❌ Error reading file {file_path}: {e}")
            raise
            
        print(f"✅ Read {records_read:,} records from {file_path.name}")
    
    def sample_records(self, file_path: Path, sample_size: int = 1000) -> List[Dict[str, Any]]:
        """Sample records from a gzipped JSON file."""
        if not file_path.exists():
            return []
            
        records = []
        try:
            with gzip.open(file_path, 'rt', encoding='utf-8') as f:
                for line in f:
                    if len(records) >= sample_size:
                        break
                    try:
                        record = json.loads(line.strip())
                        records.append(record)
                    except json.JSONDecodeError:
                        continue
        except Exception as e:
            print(f"❌ Error sampling records from {file_path}: {e}")
            return []
            
        return records


class NotebookSchemaInspector:
    """Schema inspector adapted for notebook use."""
    
    def __init__(self):
        self.handler = NotebookJSONHandler()
    
    def inspect_file_schema(self, file_path: Path, sample_size: int = 1000) -> Dict[str, Any]:
        """Inspect the schema of a JSON file."""
        if not file_path.exists():
            print(f"❌ File not found: {file_path}")
            return {}
            
        print(f"🔍 Inspecting schema for: {file_path.name}")
        
        # Sample records
        records = self.handler.sample_records(file_path, sample_size)
        if not records:
            print(f"⚠️  No valid records found in {file_path.name}")
            return {}
            
        # Analyze schema
        field_types = defaultdict(set)
        field_values = defaultdict(set)
        missing_values = defaultdict(int)
        field_lengths = defaultdict(list)
        
        for record in records:
            for field_name, field_value in record.items():
                # Track field types
                field_types[field_name].add(type(field_value).__name__)
                
                # Track field lengths
                if isinstance(field_value, str):
                    field_lengths[field_name].append(len(field_value))
                elif isinstance(field_value, list):
                    field_lengths[field_name].append(len(field_value))
                elif isinstance(field_value, dict):
                    field_lengths[field_name].append(len(field_value))
                
                # Track unique values (limit to avoid memory issues)
                if isinstance(field_value, (str, int, float)) and len(field_values[field_name]) < 100:
                    field_values[field_name].add(str(field_value))
                
                # Track missing values
                if field_value is None or field_value == "":
                    missing_values[field_name] += 1
        
        # Compile schema information
        schema_info = {
            "file_name": file_path.name,
            "file_path": str(file_path),
            "file_size_mb": file_path.stat().st_size / (1024 * 1024),
            "records_analyzed": len(records),
            "fields": {},
            "sample_records": records[:3] if records else []
        }
        
        for field_name in field_types:
            schema_info["fields"][field_name] = {
                "types": list(field_types[field_name]),
                "missing_count": missing_values[field_name],
                "missing_percentage": (missing_values[field_name] / len(records)) * 100,
                "unique_values_count": len(field_values[field_name]),
                "sample_values": list(field_values[field_name])[:10]
            }
            
            # Add length statistics
            if field_lengths[field_name]:
                lengths = field_lengths[field_name]
                schema_info["fields"][field_name]["length_stats"] = {
                    "min_length": min(lengths),
                    "max_length": max(lengths),
                    "avg_length": sum(lengths) / len(lengths)
                }
        
        return schema_info


# Utility functions
def save_json_artifact(data: Dict[str, Any], file_path: Path) -> None:
    """Save data as JSON artifact."""
    try:
        with open(file_path, 'w') as f:
            json.dump(data, f, indent=2)
        print(f"💾 Saved artifact: {file_path.name}")
    except Exception as e:
        print(f"❌ Error saving artifact {file_path}: {e}")


print("✅ Utility functions loaded successfully!")

✅ Utility functions loaded successfully!


## 🔍 Step 2: Schema Inventory

**Objective**: Analyze field structure, data types, and quality metrics for each JSON file.

**Process**: Sample records from each file to understand:
- Field names and data types
- Missing value percentages
- Sample values and value distributions
- Data quality indicators

**Output**: Comprehensive schema report for each file.
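
The core of the process above (field names, observed types, missing-value percentages) can be sketched on a handful of in-memory records. This is an illustrative sketch, not the inspector used in the next cell: `summarize_fields` is a hypothetical name, and unlike `NotebookSchemaInspector` it also counts a key that is absent from a record as missing, which is an assumption about what "missing" should mean.

```python
from collections import defaultdict
from typing import Any, Dict, List


def summarize_fields(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report observed types and missing percentage for each field."""
    n = len(records)
    if n == 0:
        return {}

    types = defaultdict(set)    # field -> set of Python type names seen
    present = defaultdict(int)  # field -> number of records containing the key
    empty = defaultdict(int)    # field -> number of None or "" values

    for record in records:
        for name, value in record.items():
            present[name] += 1
            types[name].add(type(value).__name__)
            if value is None or value == "":
                empty[name] += 1

    summary = {}
    for name in types:
        # Missing = empty values plus records where the key is absent (assumption)
        missing = empty[name] + (n - present[name])
        summary[name] = {
            "types": sorted(types[name]),
            "missing_pct": 100 * missing / n,
        }
    return summary
```

Applied to a sample returned by `sample_records`, the result maps each field name to its type list and missing percentage, which is the information the schema report in this step is built from.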

In [12]:
# ============================================================================
# STEP 2: SCHEMA INVENTORY
# ============================================================================

# PLAN: Analyze field structure, data types, missing values, sample data
# FILES: Read from data/raw/, write to logs/exploration/ and data/intermediate/schemas/
# EXPECTED: Schema inventory per file with field types and quality metrics
# TIMESTAMP: 2024-01-XX HH:MM:SS

import os
import gzip
import json
from pathlib import Path
from datetime import datetime
import sys
from collections import defaultdict

# FIXED: Proper project root detection - go up from notebooks/ to project root
notebook_dir = Path().absolute()
project_root = notebook_dir.parent  # Go up one level from notebooks/ to project root

# Validate we're in the right place
if not (project_root / "data" / "raw").exists():
    # Try alternative detection if we're already in project root
    if (notebook_dir / "data" / "raw").exists():
        project_root = notebook_dir
    else:
        raise FileNotFoundError(f"Cannot find data/raw/ directory. Current: {notebook_dir}, Project root: {project_root}")

print(f"✅ Project root detected: {project_root}")
print(f"✅ Data raw directory: {project_root / 'data' / 'raw'}")
print(f"✅ Data raw exists: {(project_root / 'data' / 'raw').exists()}")

# Add project root to path for imports
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

# Create required directories
log_full_dir = project_root / "logs" / "exploration" / "full"
log_summary_dir = project_root / "logs" / "exploration" / "summary"
schema_dir = project_root / "data" / "intermediate" / "schemas"

for dir_path in [log_full_dir, log_summary_dir, schema_dir]:
    dir_path.mkdir(parents=True, exist_ok=True)

# Generate run ID
run_id = datetime.now().strftime("%Y%m%d_%H%M%S")
start_time = datetime.now()

# Expected files (from Step 1 results)
expected_files = [
    "goodreads_books_romance.json.gz",
    "goodreads_interactions_romance.json.gz",
    "goodreads_reviews_romance.json.gz",
    "goodreads_book_authors.json.gz",
    "goodreads_book_works.json.gz",
    "goodreads_book_genres_initial.json.gz",
    "goodreads_reviews_dedup.json.gz",
    "goodreads_reviews_spoiler.json.gz",
    "goodreads_book_series.json.gz"
]

data_raw_dir = project_root / "data" / "raw"
schema_data = {
    "run_id": run_id,
    "timestamp": start_time.isoformat(),
    "environment": {
        "python_version": sys.version,
        "platform": sys.platform,
        "project_root": str(project_root),
        "notebook_dir": str(notebook_dir),
        "data_raw_dir": str(data_raw_dir)
    },
    "files": {}
}

print(f"=== SCHEMA INVENTORY ANALYSIS ===")
print(f"Run ID: {run_id}")
print(f"Start Time: {start_time}")
print(f"Project Root: {project_root}")
print(f"Data Raw Directory: {data_raw_dir}")
print(f"Python Version: {sys.version}")
print(f"Platform: {sys.platform}")
print(f"\nAnalyzing schema for {len(expected_files)} files...")
print("=" * 60)

# Initialize schema inspector
inspector = NotebookSchemaInspector()

# Analyze each file
for filename in expected_files:
    file_path = data_raw_dir / filename
    
    if file_path.exists():
        print(f"\n🔍 Analyzing: {filename}")
        
        # Analyze schema (sample 1000 records for efficiency)
        schema_info = inspector.inspect_file_schema(file_path, sample_size=1000)
        
        if schema_info:
            schema_data["files"][filename] = schema_info
            
            # Display summary
            print(f"   ✅ Fields: {len(schema_info['fields'])}")
            print(f"   ✅ Records Analyzed: {schema_info['records_analyzed']:,}")
            print(f"   ✅ File Size: {schema_info['file_size_mb']:.2f} MB")
            
            # Show key fields (first 5)
            field_names = list(schema_info['fields'].keys())[:5]
            print(f"   📋 Key Fields: {', '.join(field_names)}")
            
            # Show data quality summary
            high_missing = [f for f, data in schema_info['fields'].items() 
                           if data['missing_percentage'] > 50]
            if high_missing:
                print(f"   ⚠️  High Missing Fields: {', '.join(high_missing[:3])}")
        else:
            print(f"   ❌ Failed to analyze schema")
    else:
        print(f"\n❌ {filename} - FILE NOT FOUND")

# Calculate summary statistics
analyzed_files = list(schema_data["files"])
total_fields = sum(len(info['fields']) for info in schema_data["files"].values())
total_records_analyzed = sum(info['records_analyzed'] for info in schema_data["files"].values())

schema_data["summary"] = {
    # The count and the list need distinct keys: a repeated key in a dict
    # literal silently keeps only the last value
    "files_analyzed_count": len(analyzed_files),
    "total_fields_discovered": total_fields,
    "total_records_analyzed": total_records_analyzed,
    "files_analyzed": analyzed_files
}

end_time = datetime.now()
duration = end_time - start_time
schema_data["duration_seconds"] = duration.total_seconds()

print("\n" + "=" * 60)
print(f"SCHEMA ANALYSIS SUMMARY")
print(f"Files Analyzed: {len(analyzed_files)}/{len(expected_files)}")
print(f"Total Fields Discovered: {total_fields}")
print(f"Total Records Analyzed: {total_records_analyzed:,}")
print(f"Duration: {duration.total_seconds():.2f} seconds")
print(f"End Time: {end_time}")

# Save detailed schema report
schema_path = schema_dir / f"schema_analysis_{run_id}.json"
save_json_artifact(schema_data, schema_path)

# Save summary log
log_summary_path = log_summary_dir / f"{run_id}_schema_summary.log"
summary_content = f"Schema Analysis Summary\n"
summary_content += f"Run ID: {run_id}\n"
summary_content += f"Files Analyzed: {len(analyzed_files)}/{len(expected_files)}\n"
summary_content += f"Total Fields: {total_fields}\n"
summary_content += f"Total Records Analyzed: {total_records_analyzed:,}\n"
summary_content += f"Duration: {duration.total_seconds():.2f} seconds\n"

with open(log_summary_path, 'w') as f:
    f.write(summary_content)

print(f"\n📁 OUTPUTS SAVED:")
print(f"   Schema Analysis: {schema_path}")
print(f"   Summary Log: {log_summary_path}")

# LOG: End time, metrics, outputs
print(f"\n=== STEP 2 COMPLETE ===")
print(f"End Time: {end_time}")
print(f"Files Analyzed: {len(analyzed_files)}")
print(f"Total Fields Discovered: {total_fields}")
print(f"Total Records Analyzed: {total_records_analyzed:,}")
print(f"Outputs: 2 files saved")

✅ Project root detected: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research
✅ Data raw directory: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research/data/raw
✅ Data raw exists: True
=== SCHEMA INVENTORY ANALYSIS ===
Run ID: 20250820_001816
Start Time: 2025-08-20 00:18:16.808377
Project Root: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research
Data Raw Directory: /home/polina/Documents/goodreads_romance_research_cursor/romance-novel-nlp-research/data/raw
Python Version: 3.9.18 (main, Jan 29 2025, 16:13:49) 
[GCC 13.3.0]
Platform: linux

Analyzing schema for 9 files...

🔍 Analyzing: goodreads_books_romance.json.gz
🔍 Inspecting schema for: goodreads_books_romance.json.gz
   ✅ Fields: 29
   ✅ Records Analyzed: 1,000
   ✅ File Size: 347.88 MB
   📋 Key Fields: isbn, text_reviews_count, series, country_code, language_code
   ⚠️  High Missing Fields: isbn, asin, isbn13

ü