Description
Problem Statement
Users often struggle to construct the right filter criteria for their downloads, leading to either downloading too much (wasting bandwidth) or too little (missing important files). There's no way to preview what will be downloaded before starting.
Current User Pain
# Trial and error approach - frustrating and wasteful
filters = FilterCriteria(
    include_patterns=["*.py", "*.js"],  # Too broad? Too narrow?
    exclude_patterns=["test_*"],  # Missing other test patterns?
    max_file_size=100000,  # Is this the right size?
)
# No way to know what you'll get until after download
result = await downloader.download("tensorflow/tensorflow", "./tf", filters=filters)
print(f"Oops, downloaded {len(result.downloaded_files)} files, expected ~50")
Desired Enhanced Experience
# Interactive download builder with preview
builder = await downloader.create_download_builder("tensorflow/tensorflow")
# Preview without downloading
preview = await builder.preview(
    include_patterns=["*.py"],
    exclude_patterns=["test_*", "*_test.py"],
)
print("📊 Preview Results:")
print(f" Files to download: {preview.file_count}")
print(f" Total size: {preview.total_size_mb:.1f} MB")
print(f" Estimated time: {preview.estimated_duration}")
print(" Top directories:")
for dir_info in preview.top_directories:
    print(f" {dir_info.path}: {dir_info.file_count} files ({dir_info.size_mb:.1f} MB)")
# Interactive refinement
if preview.file_count > 1000:
    # Suggest refinements
    suggestions = builder.get_filter_suggestions(target_file_count=500)
    print("💡 Suggestions to reduce files:")
    for suggestion in suggestions:
        print(f" - {suggestion.description}")
        print(f" Filter: {suggestion.filter_change}")
        print(f" Result: ~{suggestion.estimated_files} files")
# Apply refinements and preview again
refined_preview = await builder.preview(
    include_patterns=["src/**/*.py", "core/**/*.py"],
    exclude_patterns=["**/*test*", "**/tests/**"],
    max_file_size=50000,
)
User Benefits
- Confidence: Know exactly what you're downloading before starting
- Efficiency: Avoid trial-and-error filter construction
- Learning: Understand repository structure before downloading
- Sharing: Save and share proven filter configurations
- Speed: Faster iteration on filter criteria