In PySpark, you can apply multiple conditions in a single filter() call (where() is an alias) by combining column expressions with the operators & (and), | (or), and ~ (not). Python's own and, or, and not keywords do not work on Column objects.

Here's how to modify your example to filter on two or more columns:

# Example: Select columns and filter rows using multiple conditions

In [8]:
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

In [9]:
# Start Spark session
spark = SparkSession.builder.appName("FilterExample").getOrCreate()

In [10]:
# Sample data
data = [
    ("Alice", 30, "NY"),
    ("Bob", 25, "CA"),
    ("Charlie", 35, "NY"),
    ("David", 28, "TX"),
]

In [11]:
# Create DataFrame
df = spark.createDataFrame(data, ["Name", "Age", "State"])

In [12]:
df.show()

+-------+---+-----+
|   Name|Age|State|
+-------+---+-----+
|  Alice| 30|   NY|
|    Bob| 25|   CA|
|Charlie| 35|   NY|
|  David| 28|   TX|
+-------+---+-----+



In [13]:
# Select columns and filter with multiple conditions
filtered_df = df.select("Name", "Age", "State").filter(
    (col("Age") > 26) & (col("State") == "NY")
)

# Show result
filtered_df.show()


+-------+---+-----+
|   Name|Age|State|
+-------+---+-----+
|  Alice| 30|   NY|
|Charlie| 35|   NY|
+-------+---+-----+



### Explanation:
col("Age") > 26: checks if age is greater than 26.

col("State") == "NY": checks if state is NY.

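& is the logical AND operator. Wrap each condition in parentheses: in Python, & binds more tightly than comparison operators such as > and ==, so an unparenthesized condition is not evaluated the way it reads.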
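To see why the parentheses matter, the sketch below uses Python's standard ast module to show how the interpreter parses the condition with and without them. It only parses the expression text, so it does not need Spark; the variable names are illustrative. In PySpark, the unparenthesized form typically fails with an error about converting a Column to bool.

```python
import ast

# The condition written without parentheses, and the recommended form.
unparenthesized = 'col("Age") > 26 & col("State") == "NY"'
parenthesized = '(col("Age") > 26) & (col("State") == "NY")'

bad = ast.parse(unparenthesized, mode="eval").body
good = ast.parse(parenthesized, mode="eval").body

# Without parentheses Python sees one chained comparison:
#   col("Age") > (26 & col("State")) == "NY"
print(type(bad).__name__)                     # Compare
print([type(op).__name__ for op in bad.ops])  # ['Gt', 'Eq']
print(type(bad.comparators[0]).__name__)      # BinOp, i.e. 26 & col("State")

# With parentheses the top-level node is the & joining two comparisons.
print(type(good).__name__, type(good.op).__name__)          # BinOp BitAnd
print(type(good.left).__name__, type(good.right).__name__)  # Compare Compare
```

The same precedence rule applies to | and ~, so parenthesizing every individual condition is the safe habit.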

### Other operators you can use:
|: OR

~: NOT

## Example using OR

In [7]:
df.filter((col("Age") > 30) | (col("State") == "TX")).show()

+-------+---+-----+
|   Name|Age|State|
+-------+---+-----+
|Charlie| 35|   NY|
|  David| 28|   TX|
+-------+---+-----+



## Stop the SparkSession

In [14]:
spark.stop()
