### **Introduction to `argparse`:**
`argparse` is a built-in Python module for parsing command-line arguments. It makes it easy to write user-friendly command-line interfaces for your Python programs. By using `argparse`, you can create scripts that accept input from the command line, making your programs more flexible and interactive.

The **`argparse`** module helps you:

- Define what arguments your program expects.
- Specify whether arguments are optional or required.
- Handle different types of arguments (e.g., integers, strings, booleans).
- Provide helpful error messages and usage instructions if the user inputs incorrect or missing arguments.

### **Key Features of `argparse`:**
1. **Argument Parsing:** Parses the command-line arguments in `sys.argv` and exposes them to your script as attributes of a namespace object.
2. **Argument Validation:** You can define which arguments are required and what type of values they should accept.
3. **Help and Documentation:** `argparse` generates a help message automatically and prints it when the user runs your script with the `-h` or `--help` flag.
4. **Default Values:** You can set default values for optional arguments, making your script more flexible.
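
As a minimal sketch of the features listed above, the snippet below defines one required argument and one typed argument with a default, then parses an explicit list instead of the real command line. The argument names `--name` and `--count` are illustrative assumptions, not part of any particular program.

```python
import argparse

# Hypothetical parser used only to illustrate the features above
parser = argparse.ArgumentParser(description="Feature demo")
parser.add_argument("--name", required=True, help="name to greet")               # required
parser.add_argument("--count", type=int, default=1, help="number of greetings")  # typed, with default

# parse_args accepts an explicit list, which is handy for experiments
args = parser.parse_args(["--name", "Ada", "--count", "3"])
print(args.name, args.count)

# Omitting --count falls back to the default
defaults = parser.parse_args(["--name", "Ada"])
print(defaults.count)
```

Because `type=int` is set, `args.count` arrives as an integer rather than a string, so no manual conversion is needed.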

---

### **Basic Example:**
Let's look at a basic example where we use `argparse` to handle a simple command-line argument (`--url`) for a web scraping program.

```python
import argparse

def main():
    # Create an ArgumentParser object
    parser = argparse.ArgumentParser(description="A simple web scraper.")

    # Add an argument for the target URL
    parser.add_argument("--url", required=True, help="The URL to scrape")

    # Parse the arguments
    args = parser.parse_args()

    # Access the URL argument
    print(f"Scraping URL: {args.url}")

if __name__ == "__main__":
    main()
```

### **How to Use the Script:**
Once you've saved this script (let's call it `scraper.py`), you can run it from the command line like this:
```bash
python scraper.py --url https://example.com
```
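
If the required `--url` is left out, `argparse` prints a usage message to standard error and exits with status code 2. The sketch below reproduces that behavior in-process by catching `SystemExit`; passing an empty list to `parse_args` stands in for running `python scraper.py` with no arguments.

```python
import argparse

parser = argparse.ArgumentParser(prog="scraper.py", description="A simple web scraper.")
parser.add_argument("--url", required=True, help="The URL to scrape")

try:
    parser.parse_args([])  # same as running the script with no arguments
    exit_code = 0
except SystemExit as exc:
    exit_code = exc.code   # argparse exits instead of raising a normal exception

print("exit code:", exit_code)
```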

In [11]:
import argparse
import sys

In [12]:
def greetings():
    print("hello worlds")

In [13]:
""" 
Objective: Create a very basic example of running Python code with arguments
"""
# TODO: Modify greetings to print name instead of 'worlds'
import argparse
# Your argparse-based script
def main():
    # TODO: Create Argument Parser object
    parser = argparse.ArgumentParser(description='Greeting program')

    # TODO: Add argument to accept name input
    parser.add_argument('--name', help='Name to greet')

    # TODO: Parse the command-line input using the ArgumentParser object
    # and store the result in a variable
    args = parser.parse_args()

    # TODO: Use the parsed name in the greeting
    print('Hello', args.name, "nice to meet you")

# Simulate command-line arguments
# This is equivalent to running: python script.py --name 'Khilmi Aminudin'
sys.argv = ['script.py', '--name', 'Khilmi Aminudin']
# If you save this code as a standalone script instead of a notebook cell,
# remove the sys.argv line and pass the arguments on the command line

# Execute the script
if __name__ == "__main__":
    main()

Hello Khilmi Aminudin nice to meet you
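
Overwriting `sys.argv` works inside a notebook, but it changes global state. One alternative, sketched here under the assumption that `main` may take an optional `argv` parameter (a name chosen for this example), is to hand the argument list straight to `parse_args`, which falls back to the real command line when given `None`.

```python
import argparse


def main(argv=None) -> str:
    parser = argparse.ArgumentParser(description="Greeting program")
    parser.add_argument("--name", help="Name to greet")
    # argv=None makes argparse read sys.argv[1:], so the script still works from a shell
    args = parser.parse_args(argv)
    message = f"Hello {args.name} nice to meet you"
    print(message)
    return message


# Note: the list omits the script name, unlike sys.argv
result = main(["--name", "Khilmi Aminudin"])
```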


In [14]:
""" 
Objective: Understand the optional parameters of an argument,
such as default, required, type, and help
"""
# TODO: You can use your previous code
# TODO: Add default parameter in your argument
# TODO: Add required parameter
# TODO: Explore any other parameters such as type and help

import argparse
import sys

# Your argparse-based script
def main() -> None:
    # TODO: Create Argument Parser object
    parser = argparse.ArgumentParser(description='Greeting program')

    # TODO: Add argument to accept name input
    parser.add_argument('--name', help='your name', required=True)

    parser.add_argument('--age', help='your age', type=int)

    # TODO: Parse the command-line input using the ArgumentParser object
    # and store the result in a variable
    args = parser.parse_args()

    # TODO: Use the parsed name and age in the greeting
    print('Hello', args.name,'age', args.age, "nice to meet you")

sys.argv = ['script.py', '--name', 'Khilmi Aminudin', '--age', '25']

# Execute the script
if __name__ == "__main__":
    main()


Hello Khilmi Aminudin age 25 nice to meet you
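
The cell above uses `required`, `type`, and `help` but not `default`. The following sketch adds a default and a small validating converter; the function name `positive_int` and the default age of 18 are assumptions made for illustration.

```python
import argparse


def positive_int(text: str) -> int:
    """Convert text to an int and reject values below 1 (hypothetical helper)."""
    value = int(text)  # a ValueError here is reported by argparse as an invalid value
    if value < 1:
        raise argparse.ArgumentTypeError(f"{text} is not a positive integer")
    return value


parser = argparse.ArgumentParser(description="Greeting program")
parser.add_argument("--name", required=True, help="your name")
parser.add_argument("--age", type=positive_int, default=18, help="your age (default: 18)")

with_age = parser.parse_args(["--name", "Khilmi Aminudin", "--age", "25"])
without_age = parser.parse_args(["--name", "Khilmi Aminudin"])
print(with_age.age, without_age.age)
```

Any callable that takes a string and returns a value can serve as `type`, so validation can live next to the conversion instead of being scattered through `main`.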


In [15]:
""" 
Objective: Understand arguments that are restricted to a set of choices
"""
# TODO: Add new argument in choices mode
# TODO: Add default value for the choices

def main() -> None:
    # TODO: Create Argument Parser object
    parser = argparse.ArgumentParser(description='Greeting program')

    # TODO: Add argument to accept name input
    parser.add_argument('-n', '--name', help='your name', required=True)

    parser.add_argument('--age', help='your age', type=int)

    parser.add_argument('--activity', help='your activity', default='code')

    # TODO: Parse the command-line input using the ArgumentParser object
    # and store the result in a variable
    args = parser.parse_args()

    # TODO: Use the parsed values in the greeting
    print('Hello', args.name, "your age now is", args.age, "nice to meet you")
    print('Your activity is', args.activity)

sys.argv = ['script.py', '-n', 'Jhon Doe', '--age', '25', '--activity', 'sleep']

if __name__ == "__main__":
    main()



Hello Jhon Doe your age now is 25 nice to meet you
Your activity is sleep
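
The cell above gives the activity a default but does not yet restrict which values are accepted. A sketch of the `choices` parameter follows; the allowed values `code`, `sleep`, and `eat` are assumed for the example.

```python
import argparse

parser = argparse.ArgumentParser(description="Greeting program")
parser.add_argument(
    "--activity",
    choices=["code", "sleep", "eat"],  # assumed set of allowed values
    default="code",
    help="your activity",
)

chosen = parser.parse_args(["--activity", "sleep"])
fallback = parser.parse_args([])
print(chosen.activity, fallback.activity)

# A value outside choices makes argparse print an error and exit
try:
    parser.parse_args(["--activity", "dance"])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)
```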


In [16]:
""" 
Objective: Understand arguments that act as flags (the action parameter)
"""
# TODO: Add a new argument that acts as a flag
# TODO: Compare how the program runs with and without the flag
def main() -> None:
    # TODO: Create Argument Parser object
    parser = argparse.ArgumentParser(description='Greeting program')

    # TODO: Add argument to accept name input
    parser.add_argument('-n', '--name', help='your name', required=True)
    
    parser.add_argument('-v', '--validate', help='validate status of your data', action='store_true')

    # TODO: Parse the command-line input using the ArgumentParser object
    # and store the result in a variable
    args = parser.parse_args()

    # TODO: Use the parsed name and the flag in the output
    print('Hello', args.name, "nice to meet you")
    print('validation status is', args.validate)

sys.argv = ['script.py', '-n', 'Jhon Doe', '-v'] # passing -v sets validate to True
# sys.argv = ['script.py', '-n', 'Jhon Doe'] # omitting -v leaves validate as False

if __name__ == "__main__":
    main()


Hello Jhon Doe nice to meet you
validation status is True
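
`store_true` is one of several built-in actions. As a brief sketch, `store_false` inverts a flag and `count` tallies repeated occurrences; the option names `--no-color` and `--verbose` below are illustrative only.

```python
import argparse

parser = argparse.ArgumentParser(description="Action demo")
parser.add_argument("-v", "--validate", action="store_true", help="set validate to True")
parser.add_argument("--no-color", dest="color", action="store_false", help="set color to False")
parser.add_argument("-V", "--verbose", action="count", default=0, help="repeat to raise verbosity")

flags_on = parser.parse_args(["-v", "--no-color", "-VV"])
flags_off = parser.parse_args([])
print(flags_on.validate, flags_on.color, flags_on.verbose)
print(flags_off.validate, flags_off.color, flags_off.verbose)
```

None of these options take a value on the command line; their presence alone changes the result.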


In [None]:
""" 
Objective: Implement in Web Scraping
"""
# TODO: Create a function that runs the scraper according to user input
# TODO: Add argument for output format, filename, and page limit

import argparse
import logging
import sys
import requests
import json
import csv
from bs4 import BeautifulSoup as bs




logging.basicConfig(level=logging.DEBUG, force=True, filename="book_toscrape.log")

def save_to_file(output: str, filename: str, data: list[dict]) -> None:
    """Write the scraped records to <filename>.<output> as JSON or CSV."""
    if output == "json":
        with open(f"{filename}.{output}", "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=4)
    elif output == "csv":
        if not data:
            # data[0] below would raise IndexError on an empty list
            logging.error("no data to save")
            return
        # newline="" stops the csv module from writing blank rows on Windows
        with open(f"{filename}.{output}", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=data[0].keys())
            writer.writeheader()
            writer.writerows(data)
    else:
        logging.error("invalid output file format")
        return

    logging.info("successfully saved to file")
        


def main() -> None:
    parser = argparse.ArgumentParser(description='Web Scraping program')

    parser.add_argument('--url', help='URL to scrape', required=True)
    parser.add_argument('--output', help='Output format', choices=['csv', 'json'], default='csv')
    parser.add_argument('--filename', help='Output filename', default='output')
    parser.add_argument('--limit', help='Page limit', type=int, default=1)

    args = parser.parse_args()

    logging.info(f"SCRAPING on URL: {args.url}, OUTPUT: {args.output}, FILENAME: {args.filename}, LIMIT: {args.limit}")

    # Your scraping code here
    logging.info("START SCRAPING")

    data: list[dict] = []

    for i in range(1, args.limit+1):
        url = args.url
        if i > 1:
            # Strip any trailing slash so the path does not contain "//"
            url = f"{args.url.rstrip('/')}/catalogue/page-{i}.html"
        
        logging.info(f"SCRAPING PAGE: {i} of {args.limit} pages url: {url}")

        try:
            response = requests.get(url)
            response.raise_for_status()

            soup = bs(response.text, 'html.parser')
            books = soup.find_all('article', class_='product_pod')

            for book in books:
                title = book.h3.a.get_text()
                price = book.find('p', class_='price_color').get_text()
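                book_url = book.h3.a['href']
                # Links on page 1 are relative to the site root; later pages are relative to /catalogue/
                base_url = args.url.rstrip('/')
                book_url = f"{base_url}/{book_url}" if i == 1 else f"{base_url}/catalogue/{book_url}"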

                data.append({
                    'title': title,
                    'price': price,
                    'book_url': book_url,
                    'cover_image_url': None,
                    'product_description':None,
                    'upc': None,
                    'product_type': None,
                    'price_before_tax': None,
                    'price_after_tax': None,
                    'tax': None,
                    'stock': None,
                    'number_of_reviews': None
                })

                logging.info(f"TITLE: {title}, PRICE: {price}, BOOK URL: {book_url}")
        except requests.exceptions.RequestException as e:
            logging.error(f"Error: {e}")

        logging.info(f"PAGE {i} DONE")
        
    
    logging.info("START GET DETAIL BOOK DATA")
    for d in data:
        try:
            # Outer double quotes keep the f-string valid on Python versions before 3.12
            logging.info(f"get detail book {d.get('title')}")
            response = requests.get(d.get('book_url'))
            response.raise_for_status()

            soup = bs(response.text, "html.parser")

            p_tag = soup.find_all('p')
            product_description = p_tag[3].get_text()

            # Strip any trailing slash so the image URL does not contain "//"
            img_url = soup.find('img')['src'].replace('../..', args.url.rstrip('/'))

            table = soup.find('table', class_='table table-striped')
            td_data = table.find_all('td') 

            d['product_description'] = product_description
            d['cover_image_url'] = img_url
            d['upc'] = td_data[0].get_text()
            d['product_type'] = td_data[1].get_text()
            d['price_before_tax'] = td_data[2].get_text()
            d['price_after_tax'] = td_data[3].get_text()
            d['tax'] = td_data[4].get_text()
            d['stock'] = td_data[5].get_text().replace('In stock (','').replace(' available)','')
            d['number_of_reviews'] = td_data[6].get_text()

            logging.info(f"success get detail book {d.get('title')}")
        except requests.exceptions.RequestException as e:
            logging.info(f"failed get detail book {d.get('title')}")
            logging.error(f"Error: {e}")

    logging.info("DONE GET DETAIL BOOK DATA")

    logging.info("SAVE TO FILE")
    save_to_file(args.output, args.filename ,data)
    logging.info("SAVE TO FILE DONE")

    logging.info("SCRAPING DONE")


# sys.argv = ['script.py', '--url', 'https://books.toscrape.com/', '--output', 'json', '--filename', 'output', '--limit', '50']
sys.argv = ['script.py', '--url', 'https://books.toscrape.com/', '--output', 'json', '--filename', 'output', '--limit', '5']

if __name__ == "__main__":
    main()
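
The scraper builds absolute URLs by string concatenation, which depends on whether `--url` ends with a slash. One alternative, offered as a sketch rather than a change to the code above, is `urllib.parse.urljoin`, which resolves a relative link against the page it was found on. The book and image paths below are made-up examples in the shape the site uses.

```python
from urllib.parse import urljoin

site = "https://books.toscrape.com/"

# Link found on the front page (relative to the site root)
first_page_link = urljoin(site, "catalogue/some-book_1/index.html")

# Link found on a later listing page (relative to /catalogue/)
listing_page = urljoin(site, "catalogue/page-2.html")
later_page_link = urljoin(listing_page, "some-book_1/index.html")

# Image path found on a book's detail page
image_link = urljoin(later_page_link, "../../media/cache/cover.jpg")

print(first_page_link)
print(later_page_link)
print(image_link)
```

With this approach the `i == 1` special case is unnecessary, since each link is resolved against the URL of the page that contained it.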

### **Reflection**
Argparse offers flexibility to your code. Do you think it's better to write separate pieces of code with similar functionality, or to make a single piece of code flexible?

I think the code should be kept separate: once it starts to grow complex, separate pieces will be easier to maintain.

### **Exploration**
Typer, which builds on top of Click rather than argparse, offers a more modern, Pythonic, and user-friendly approach based on type hints.