
1. What is Shell Scripting?


Concept:

Shell scripting means automating tasks you’d usually do manually in a terminal—listing files, copying, renaming, etc.

Python is a common choice for this kind of automation because the same script can run on Windows, Linux, and macOS.

Explanation:

Lists all files and folders in the current directory, alphabetically, one per line.

Use Case:

Quickly inventory a working directory (e.g., before batch-processing files in a data pipeline, or after extracting a dataset).


In [9]:
import os
print("Files and folders:")
for f in sorted(os.listdir('.')):
    print(f)


Files and folders:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt
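For the inventory use case it can help to separate files from folders. The following is a minimal sketch using `pathlib` from the standard library; the helper name `inventory` is illustrative and not part of the original notebook.

```python
from pathlib import Path


def inventory(directory="."):
    """Return (files, folders) as two alphabetically sorted lists of names."""
    entries = sorted(Path(directory).iterdir(), key=lambda p: p.name)
    files = [p.name for p in entries if p.is_file()]
    folders = [p.name for p in entries if p.is_dir()]
    return files, folders


# Inventory the current working directory
files, folders = inventory(".")
print("Files:", files)
print("Folders:", folders)
```

Returning two lists instead of printing directly makes the result easy to reuse, for example to loop over only the files in a later batch step.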


2. Shell, Kernel, and Terminal

Concept:

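In traditional scripting, the shell is the go-between for the user and the kernel (the OS core), and the terminal is the window in which the shell runs.

With Python, you call operating-system services through standard-library modules such as os and sys instead of going through a shell.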

Explanation:

Identifies the running Python interpreter and OS.

Use Case:

Ensure scripts run correctly in heterogeneous environments (Windows, Linux, macOS)—vital for reproducibility in research or deployment.

In [10]:
import sys
print("Python executable:", sys.executable)
print("Platform:", sys.platform)


Python executable: c:\Users\jim\AppData\Local\Programs\Python\Python311\python.exe
Platform: win32
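One way to act on `sys.platform` is to branch on it when a script has to call an OS-specific tool. The sketch below is only an assumption about how such a helper might look: `listing_command` is a hypothetical name, and the chosen commands (`dir` through `cmd /c` on Windows, `ls -1` elsewhere) are just examples.

```python
import sys


def listing_command(platform=sys.platform):
    """Pick a directory-listing command for the given sys.platform value."""
    if platform.startswith("win"):
        # 'dir' is a cmd.exe built-in, so it has to be run through cmd
        return ["cmd", "/c", "dir"]
    # Linux reports 'linux' and macOS reports 'darwin'; both provide ls
    return ["ls", "-1"]


print("Platform:", sys.platform)
print("Listing command:", listing_command())
```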


3. Elements of Shell Scripting in Python


Concept:

Automate routine filesystem operations—make, list, rename, and delete directories.

Explanation:

Creates and deletes a directory, showing the directory list after each step.

Use Case:

Set up temporary directories for processing large files, or tear down staging folders in CI/CD workflows.

In [11]:
import os

os.mkdir('testdir')
print("After mkdir:")
for f in sorted(os.listdir('.')):
    print(f)
os.rmdir('testdir')
print("After rmdir:")
for f in sorted(os.listdir('.')):
    print(f)


After mkdir:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt
testdir
After rmdir:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt
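For the temporary-directory use case, the standard library's `tempfile.TemporaryDirectory` can handle both creation and teardown. This is a minimal sketch, assuming the scratch files are not needed afterwards; the function name, the `staging_` prefix, and the file name are made up for illustration.

```python
import os
import tempfile


def process_in_scratch_dir():
    """Create a scratch directory, use it, and return its path after cleanup."""
    with tempfile.TemporaryDirectory(prefix="staging_") as scratch:
        # Write an intermediate file inside the scratch directory
        with open(os.path.join(scratch, "part_001.tmp"), "w") as f:
            f.write("intermediate data")
        print("Inside block:", sorted(os.listdir(scratch)))
    # Leaving the with-block removes the directory and everything in it
    return scratch


path = process_in_scratch_dir()
print("Still exists after block?", os.path.exists(path))
```

Unlike `os.rmdir`, which only removes empty directories, the context manager also deletes any files left inside.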




Concept:

Python scripts are cross-platform, easily readable, and integrate seamlessly with scientific libraries and data tools.


Explanation:

Prints working directory, files, and the current date.

Use Case:

Timestamp automated log files, or track which versions of data and scripts were used in batch runs for compliance or audit trails.

In [12]:
import os
import datetime

print("Current working directory:", os.getcwd())
print("Files here:")
for f in sorted(os.listdir('.')):
    print(f)
print("Today is:", datetime.date.today())


Current working directory: c:\Users\jim\OneDrive\Desktop\AMJ Group\Teaching\Class Materials\Dell_GenAI\Applied_Data_Science_With_Python_ILT_Materials_Sept_24\Instructor_Materials\0.1_Instructor_Slides_and_Notebooks\Lesson_04_Working_With_Pandas
Files here:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt
Today is: 2025-08-01


The sys Module


Concept:

The sys module gives you access to system-level details and allows scripts to accept command-line arguments for flexible automation.


Explanation:

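Shows how to read command-line arguments (for example, to batch-process multiple files), check the interpreter version, and inspect the standard input stream that receives piped data. Inside Jupyter, sys.argv holds the kernel launcher's arguments rather than your own.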

Use Case:

Build parameterized scripts for ETL jobs or pass runtime config for ML/data pipelines.

In [13]:
import sys

print("Arguments passed:", sys.argv)
print("Python version:", sys.version)
print("Standard input object:", sys.stdin)


Arguments passed: ['C:\\Users\\jim\\AppData\\Roaming\\Python\\Python311\\site-packages\\ipykernel_launcher.py', '--f=c:\\Users\\jim\\AppData\\Roaming\\jupyter\\runtime\\kernel-v37a181e4061ce03fce78018e8d59893fa8915c6b9.json']
Python version: 3.11.9 (tags/v3.11.9:de54cf5, Apr  2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
Standard input object: <_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>
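To make the argument and stdin handling concrete, here is a small sketch of a parameterized script entry point. The function `collect_inputs` and the file names are hypothetical, and `io.StringIO` stands in for real piped input so the example also runs inside a notebook.

```python
import io
import sys


def collect_inputs(argv, stream):
    """Return file names from argv, or lines from the stream if none were given."""
    names = argv[1:]  # argv[0] is the script path itself
    if names:
        return names
    # No arguments: fall back to one name per line of piped input
    return [line.strip() for line in stream if line.strip()]


# Simulated invocations (in a real script: collect_inputs(sys.argv, sys.stdin))
print(collect_inputs(["etl.py", "jan.csv", "feb.csv"], io.StringIO("")))
print(collect_inputs(["etl.py"], io.StringIO("mar.csv\napr.csv\n")))
```

Passing `argv` and the stream in as parameters, rather than reading `sys.argv` and `sys.stdin` inside the function, keeps the logic easy to test.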


The os Module


Concept:

The os module is the Swiss army knife for Python automation—handle files, directories, permissions, and more.

Explanation:

Creates, renames, and removes a directory in sequence, printing a sorted directory listing before and after each step.

Use Case:

Automate the creation of project folders, cleanup after data imports, or prepare directory trees for batch job outputs.

In [14]:
import os

print("Before mkdir:")
for f in sorted(os.listdir('.')):
    print(f)

os.mkdir('example_dir')
print("\nAfter mkdir:")
for f in sorted(os.listdir('.')):
    print(f)

os.rename('example_dir', 'renamed_dir')
print("\nAfter rename:")
for f in sorted(os.listdir('.')):
    print(f)

os.rmdir('renamed_dir')
print("\nAfter rmdir:")
for f in sorted(os.listdir('.')):
    print(f)


Before mkdir:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt

After mkdir:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
example_dir
housing_data.csv
log_2025-08-01.txt
sample.txt

After rename:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
renamed_dir
sample.txt

After rmdir:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipyn
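For the directory-tree use case, `os.makedirs` can create nested folders in one call, where `os.mkdir` creates only a single level. The sketch below assumes a made-up layout (`raw`, `reports/daily`) and a hypothetical helper name, and works inside a throwaway directory so it leaves nothing behind.

```python
import os
import tempfile


def make_output_tree(root, subdirs):
    """Create root/<subdir> for each entry; existing folders are left alone."""
    created = []
    for sub in subdirs:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # also creates missing parents
        created.append(path)
    return created


# Demonstrated inside a temporary directory that is removed afterwards
with tempfile.TemporaryDirectory() as root:
    make_output_tree(root, ["raw", os.path.join("reports", "daily")])
    for dirpath, dirnames, filenames in os.walk(root):
        print(os.path.relpath(dirpath, root), sorted(dirnames))
```

With `exist_ok=True` the call is safe to repeat, which suits jobs that run on a schedule.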

The subprocess Module


Concept:

subprocess lets Python control other programs—run system commands, collect outputs, or chain tools together (as in Unix pipes).

Explanation:

Runs a shell command and neatly prints its output and exit status.

Use Case:

Integrate legacy tools or command-line utilities into Python workflows, or trigger OS-level jobs from inside data pipelines.

In [16]:
import subprocess

# 'dir' is a cmd.exe built-in on Windows, so shell=True is required; on Linux/macOS use 'ls' instead.
result = subprocess.run('dir', capture_output=True, text=True, shell=True)
print("Directory listing (dir):")
for line in result.stdout.strip().split('\n'):
    print(line)
print("Exit code:", result.returncode)



Directory listing (dir):
Volume in drive C is OS
 Volume Serial Number is 749F-39EB

 Directory of c:\Users\jim\OneDrive\Desktop\AMJ Group\Teaching\Class Materials\Dell_GenAI\Applied_Data_Science_With_Python_ILT_Materials_Sept_24\Instructor_Materials\0.1_Instructor_Slides_and_Notebooks\Lesson_04_Working_With_Pandas

08/01/2025  08:48 AM    <DIR>          .
07/31/2025  06:53 PM    <DIR>          ..
08/01/2025  05:01 AM            18,164 4.01_Introduction_to_Pandas.ipynb
08/01/2025  06:16 AM            56,683 4.02_Pandas_DataFrame.ipynb
08/01/2025  06:16 AM            52,029 4.03_Date_and_TimeDelta_in_Pandas.ipynb
08/01/2025  06:15 AM           146,644 4.04_Working_with_Text_Data.ipynb
06/05/2024  05:00 AM           526,795 HousePrices.csv
06/05/2024  05:00 AM           477,205 housing_data.csv
06/05/2024  04:59 AM         3,073,159 Lesson_04_Knowledge_Checks.pptx
08/01/2025  08:29 AM               348 log_2025-08-01.txt
08/01/2025  08:48 AM            35,863 Python_Shell.01.08.ipynb
08/
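The cell above depends on the Windows `dir` built-in. As a portable sketch, the example below launches a child Python process through `sys.executable` and passes the command as a list, which avoids `shell=True`. The second call uses the `input` parameter to feed text to the child's stdin, roughly what a Unix pipe would do. The child programs are one-liners invented for the demonstration.

```python
import subprocess
import sys

# Run a child Python process; passing a list avoids shell=True and quoting issues
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,
    text=True,
)
print("Output:", result.stdout.strip())
print("Exit code:", result.returncode)

# Feed text to the child's stdin, similar in spirit to a Unix pipe
upper = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    input="piped text",
    capture_output=True,
    text=True,
)
print("Piped result:", upper.stdout.strip())
```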

Reading and Writing Files


Concept:

File I/O is foundational: create, read, update, or append to data and log files as part of any data pipeline or system monitoring.

Explanation:

Writes and then reads a file, printing each line cleanly.

Use Case:

Generate audit logs, save ETL checkpoints, or read configs for jobs.

In [17]:
# Write and read a file with organized output
with open('sample.txt', 'w') as f:
    f.write('Line one\nLine two\nPython scripting!')

print("Contents of sample.txt:")
with open('sample.txt') as f:
    for line in f:
        print(line.strip())


Contents of sample.txt:
Line one
Line two
Python scripting!
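The concept also mentions appending, which the cell above does not show. A minimal sketch, assuming a log-style file where each event is one line: mode `'a'` adds to the end of the file instead of overwriting it. The helper names and `checkpoint.log` are hypothetical.

```python
import os
import tempfile


def append_line(path, text):
    """Append one line to path, creating the file if it does not exist."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text + "\n")


def read_lines(path):
    """Return the file's lines without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


with tempfile.TemporaryDirectory() as d:
    log_path = os.path.join(d, "checkpoint.log")  # hypothetical file name
    append_line(log_path, "step 1 done")
    append_line(log_path, "step 2 done")
    lines = read_lines(log_path)
    print(lines)
```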


Miscellaneous File Operations: shutil


Concept:

Use shutil for copying, moving, and deleting files and directories—especially for workflow automation or safe archiving.

Explanation:

Demonstrates copy, move, and delete, with output after each action.

Use Case:

Stage intermediate files, implement safe ETL "move-then-delete," or organize nightly data backups.

In [18]:
import os
import shutil

with open('file1.txt', 'w') as f:
    f.write('Sample data')

shutil.copy2('file1.txt', 'file2.txt')
print("After copy:")
for f in sorted(os.listdir('.')):
    print(f)

shutil.move('file2.txt', 'moved_file.txt')
print("\nAfter move:")
for f in sorted(os.listdir('.')):
    print(f)

os.remove('moved_file.txt')
os.remove('file1.txt')
print("\nAfter cleanup:")
for f in sorted(os.listdir('.')):
    print(f)


After copy:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
file1.txt
file2.txt
housing_data.csv
log_2025-08-01.txt
sample.txt

After move:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
file1.txt
housing_data.csv
log_2025-08-01.txt
moved_file.txt
sample.txt

After cleanup:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt
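The cell above works on single files. For whole directories and for the backup use case, `shutil` also offers `copytree`, `make_archive`, and `rmtree`. The following sketch uses made-up folder names (`data`, `staged`, `backup`) and runs inside a temporary directory so it does not touch the working directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as work:
    # Build a small folder to back up (names are made up for the demo)
    data_dir = os.path.join(work, "data")
    os.mkdir(data_dir)
    with open(os.path.join(data_dir, "file1.txt"), "w") as f:
        f.write("Sample data")

    # copytree duplicates a whole directory tree
    staged = shutil.copytree(data_dir, os.path.join(work, "staged"))

    # make_archive zips the tree and returns the archive's path
    archive = shutil.make_archive(os.path.join(work, "backup"), "zip", root_dir=staged)
    archive_name = os.path.basename(archive)
    archive_existed = os.path.exists(archive)

    # rmtree removes a directory even when it is not empty
    shutil.rmtree(staged)
    staged_removed = not os.path.exists(staged)

print(archive_name, archive_existed, staged_removed)
```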


Replacing sed, grep, awk: Text Processing


Concept:

Pattern search, replace, and field extraction are the backbone of log analysis, config editing, and data wrangling.

Explanation:

Demonstrates Python equivalents of grep, sed, and awk on a small sample file; each labeled step is described after the use case.

Use Case:

Search for errors in log files, filter data records, or validate that ETL steps completed.

[grep]

Finds lines containing "Python" (substring search, like grep Python sample.txt).

[grep with regex]

Finds lines ending with "Processing" using a regular expression (like grep 'Processing$' sample.txt).

[sed]

Replaces "Shell" with "Terminal" in every line (sed 's/Shell/Terminal/g' sample.txt).

[sed with regex]

Uses regex to replace any word starting with Data (e.g., "DataA" → "Entry") (sed -E 's/Data\w*/Entry/g').

[awk]

Prints the second field from comma-separated lines (like awk -F, '{print $2}' sample.txt).

In [19]:
# Create a test file (simulates a log, CSV, or data dump)
with open('sample.txt', 'w') as f:
    f.write("Python Shell Automation Pythonic\n")
    f.write("Unix Shell Text Processing\n")
    f.write("Python is powerful\n")
    f.write("Field1,Field2,Field3\n")
    f.write("DataA,DataB,DataC\n")

# --- grep-like: Print lines containing 'Python'
print("\n[grep] Lines containing 'Python':")
with open('sample.txt') as f:
    for line in f:
        if 'Python' in line:
            print(line.strip())

# --- grep with regex: Print lines ending with 'Processing'
import re
print("\n[grep with regex] Lines ending with 'Processing':")
pattern = r'Processing$'
with open('sample.txt') as f:
    for line in f:
        if re.search(pattern, line):
            print(line.strip())

# --- sed-like: Replace 'Shell' with 'Terminal' in every line
print("\n[sed] Replace 'Shell' with 'Terminal' in each line:")
with open('sample.txt') as f:
    for line in f:
        print(line.replace('Shell', 'Terminal').strip())

# --- sed with regex: Replace any word starting with 'Data' with 'Entry'
print("\n[sed with regex] Replace words starting with 'Data' with 'Entry':")
pattern = r'Data\w*'
with open('sample.txt') as f:
    for line in f:
        print(re.sub(pattern, 'Entry', line).strip())

# --- awk-like: Print the second field (column) from comma-separated lines
print("\n[awk] Print the second field from CSV lines:")
with open('sample.txt') as f:
    for line in f:
        if ',' in line:
            fields = line.strip().split(',')
            if len(fields) > 1:
                print(fields[1])



[grep] Lines containing 'Python':
Python Shell Automation Pythonic
Python is powerful

[grep with regex] Lines ending with 'Processing':
Unix Shell Text Processing

[sed] Replace 'Shell' with 'Terminal' in each line:
Python Terminal Automation Pythonic
Unix Terminal Text Processing
Python is powerful
Field1,Field2,Field3
DataA,DataB,DataC

[sed with regex] Replace words starting with 'Data' with 'Entry':
Python Shell Automation Pythonic
Unix Shell Text Processing
Python is powerful
Field1,Field2,Field3
Entry,Entry,Entry

[awk] Print the second field from CSV lines:
Field2
DataB
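Splitting on `','` works for the sample file but breaks when a field itself contains a comma inside quotes. As a hedged alternative for the awk-like step, the standard `csv` module handles quoting. The function name and the sample rows below are invented for illustration.

```python
import csv
import io


def second_fields(lines):
    """Yield the second column of each row that has at least two columns."""
    for row in csv.reader(lines):
        if len(row) > 1:
            yield row[1]


# Made-up sample: the quoted field contains a comma that split(',') would break on
sample = io.StringIO('Field1,Field2,Field3\nDataA,"Data,B",DataC\nno commas here\n')
print(list(second_fields(sample)))
```

`csv.reader` accepts any iterable of lines, so an open file object can be passed in place of the `StringIO` used here.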


Dealing with Exit Codes and IO

Concept:

Always check exit codes when automating—catch failures, trigger alerts, or take alternate actions.

Explanation:

Attempts to list a non-existent folder, shows exit code and error.

Use Case:

In robust batch jobs, use this to catch errors, log them, or send notifications when a process fails.

In [21]:
import subprocess

# Try to list a non-existent directory on Windows
proc = subprocess.run('dir non_existent_dir', capture_output=True, text=True, shell=True)
print("Exit code:", proc.returncode)
if proc.returncode != 0:
    print("Error output:", proc.stderr.strip())



Exit code: 1
Error output: File Not Found
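Instead of comparing `returncode` by hand, `subprocess.run` can raise an exception on failure when called with `check=True`. The sketch below is one possible wrapper, not a prescribed pattern: `run_checked` is a hypothetical helper, and the failing command is a child Python process that exits with code 3 on purpose, so the example runs on any OS.

```python
import subprocess
import sys


def run_checked(args):
    """Run a command; return (ok, exit_code, stderr_text)."""
    try:
        done = subprocess.run(args, capture_output=True, text=True, check=True)
        return True, done.returncode, done.stderr.strip()
    except subprocess.CalledProcessError as err:
        # check=True turns any non-zero exit code into this exception
        return False, err.returncode, err.stderr.strip()


# A child process that fails on purpose with exit code 3
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
print(run_checked(failing))
```

In a batch job, the `except` branch is where logging or a notification call would go.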


Time and Date Utilities

Concept:

Time/date are crucial for logs, audit trails, timestamped filenames, or scheduling.

Explanation:

Prints the current time as a formatted string (time.strftime) and as a datetime object (datetime.datetime.now()).

Use Case:

Add timestamps to filenames, generate daily reports, or schedule cron-like tasks.

In [None]:
import datetime
import time

print("Current time:", time.strftime('%Y-%m-%d %H:%M:%S'))
print("Datetime object:", datetime.datetime.now())
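For the timestamped-filename use case, a small helper built on `strftime` might look like the sketch below. The name `timestamped_name` and the `report` prefix are assumptions made for the example; the format string is one reasonable choice among many.

```python
import datetime


def timestamped_name(prefix, when=None, ext="txt"):
    """Build a name like report_20250801_084947.txt from a datetime."""
    when = when or datetime.datetime.now()
    # This strftime pattern avoids ':' which Windows forbids in file names
    return f"{prefix}_{when.strftime('%Y%m%d_%H%M%S')}.{ext}"


fixed = datetime.datetime(2025, 8, 1, 8, 49, 47)
print(timestamped_name("report", fixed))  # report_20250801_084947.txt
print(timestamped_name("report"))         # uses the current time
```

Putting the year first means the names sort chronologically in an alphabetical directory listing.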


Shell Scripting in Python: Mini-Project

Concept:

Pull together everything learned to automate and log a full workflow.

Explanation:

Logs the current directory contents and timestamp to a dated file, then prints it out.

Use Case:

Daily ETL job log, automated system inventory, or dataset change tracking.

In [22]:
import datetime
import os

log_file = f"log_{datetime.date.today()}.txt"
with open(log_file, 'w') as f:
    f.write("Files and folders:\n")
    for file in sorted(os.listdir('.')):
        f.write(f"{file}\n")
    f.write("\nDate completed: " + str(datetime.datetime.now()))

print("Wrote log file:", log_file)
print("Log file contents:")
with open(log_file) as f:
    for line in f:
        print(line.strip())


Wrote log file: log_2025-08-01.txt
Log file contents:
Files and folders:
4.01_Introduction_to_Pandas.ipynb
4.02_Pandas_DataFrame.ipynb
4.03_Date_and_TimeDelta_in_Pandas.ipynb
4.04_Working_with_Text_Data.ipynb
HousePrices.csv
Lesson_04_Knowledge_Checks.pptx
Python_Shell.01.08.ipynb
Weather_data.csv
housing_data.csv
log_2025-08-01.txt
sample.txt

Date completed: 2025-08-01 08:49:47.723522


Web Scraping with Python (Preview)

Concept:

Modern Python shell scripting extends to web scraping and HTTP automation.

Explanation:

Fetches a page and prints its <title>—a first step in web scraping.

Use Case:

Automate extraction of public datasets, monitor website updates, or pipeline scraped data directly into your data lake.

In [23]:
import requests
from bs4 import BeautifulSoup

url = 'https://example.com'
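res = requests.get(url, timeout=10)  # avoid hanging forever on a slow server
res.raise_for_status()  # stop early on HTTP errors such as 404 or 500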
soup = BeautifulSoup(res.text, 'html.parser')
print("Page title:", soup.title.text)


Page title: Example Domain
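To see what title extraction involves without a network connection or third-party packages, here is a standard-library sketch based on `html.parser`. It is not how BeautifulSoup works internally; `TitleParser`, `extract_title`, and the hand-written HTML string are all illustrative stand-ins.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>
        if self.in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


# Hand-written HTML stands in for a downloaded page
page = "<html><head><title>Example Domain</title></head><body><h1>Hi</h1></body></html>"
print("Page title:", extract_title(page))
```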


## Detailed Explanations: Django Minimal Class-Based View Demo

---

**Cell 1: Project Setup (Terminal Commands)**

In this step, we install Django (if it’s not already present), create a new Django project called `mysite`, enter the project directory, and create a new application called `demo`.  
- `django-admin startproject mysite` initializes a new Django project with all the necessary settings and management files.
- `python manage.py startapp demo` creates a reusable application module for your Django project, where you’ll write your views and logic.

---

**Cell 2: Creating a Class-Based View (`demo/views.py`)**

Here, we define a new Python class called `HelloDjangoView` that inherits from Django’s built-in `View` class.  
- The `get` method is triggered for HTTP GET requests.  
- `return HttpResponse("Hello, Django!")` means any request to this view will return a simple text message as the HTTP response.  
- Class-based views are powerful because they allow you to organize logic and use built-in Django features for more complex web apps.
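The idea that `get` handles GET requests can be illustrated with a framework-free toy model. The sketch below is a deliberate simplification and not Django's real implementation: `MiniView` and `HelloView` are hypothetical classes, and responses are plain `(status, body)` tuples instead of `HttpResponse` objects.

```python
class MiniView:
    """Toy stand-in for a class-based view; not Django's real implementation."""

    http_method_names = ["get", "post"]

    def dispatch(self, method):
        # Look up a handler named after the lowercased HTTP method
        name = method.lower()
        handler = getattr(self, name, None) if name in self.http_method_names else None
        if handler is None:
            return 405, "Method Not Allowed"
        return handler()


class HelloView(MiniView):
    def get(self):
        return 200, "Hello, Django!"


view = HelloView()
print(view.dispatch("GET"))
print(view.dispatch("POST"))
```

Because `HelloView` defines only `get`, the toy dispatcher answers any other method with a 405 status, which mirrors the behavior you would expect from a view that supports GET only.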

---

**Cell 3: Mapping the View to a URL (`demo/urls.py`)**

This file creates a URL mapping so Django knows which Python function or class to call for a given URL.
- We import our `HelloDjangoView` and map the URL path `'hello/'` to it using `HelloDjangoView.as_view()`.
- This means that any user visiting `/hello/` on the site will trigger this view and see our message.

---

**Cell 4: Including the App’s URLs in the Project (`mysite/urls.py`)**

Django projects can have many apps. The main `urls.py` in your project tells Django to route requests to each app’s own `urls.py`.
- We add `path('', include('demo.urls'))` to tell Django to include the URL patterns from our `demo` app at the root level.
- Now, `/hello/` (and any other demo app URLs you define) will work when you run your project.

---

**Cell 5: Running the Development Server**

We use `python manage.py runserver` to start Django’s built-in development web server.
- This allows you to test your site locally in a browser.
- When you go to [http://127.0.0.1:8000/hello/](http://127.0.0.1:8000/hello/), Django matches the URL to your `HelloDjangoView` and returns "Hello, Django!".

---

**Summary:**  
In these five steps, you created a working Django project, set up a minimal app, built and connected a class-based view, and started the server to serve your new web page.  
This demo illustrates the core flow of all Django projects: **routing a URL to a Python class that generates an HTTP response.**


In [None]:

!pip install django
!django-admin startproject mysite
!cd mysite && python manage.py startapp demo



In [None]:
from django.http import HttpResponse
from django.views import View

class HelloDjangoView(View):
    def get(self, request):
        return HttpResponse("Hello, Django!")


In [None]:
from django.urls import path
from .views import HelloDjangoView

urlpatterns = [
    path('hello/', HelloDjangoView.as_view(), name='hello_django'),
]


In [None]:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('demo.urls')),
]


In [None]:
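# Run from the folder that contains mysite; the server keeps running until interrupted.
!cd mysite && python manage.py runserver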
