# Dynamically generate Schemas from existing tables
You can dynamically load a `DataSet` and its corresponding `Schema` from an existing table. To illustrate this, let us first make a temporary table that we can load later.

In [1]:
from pyspark.sql import SparkSession
spark = SparkSession.builder.config("spark.ui.showConsoleProgress", "false").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

In [2]:
import pandas as pd

(
    spark.createDataFrame(
        pd.DataFrame(
            dict(
                name=["Jack", "John", "Jane"],
                age=[20, 30, 40],
            )
        )
    )
    .createOrReplaceTempView("person_table")
)


We can now load this data using `load_table()`. Note that the `Schema` is inferred from the table itself: the table doesn't need to have been serialized using `typedspark`.

In [3]:
from typedspark import load_table

df, Person = load_table(spark, "person_table")

From here, it's trivial to generate the `Schema` for this table.

In [4]:
Person

To also generate documentation, run:

In [5]:
Person.print_schema(schema_name="Person", include_documentation=True)

from typing import Annotated

from pyspark.sql.types import LongType, StringType

from typedspark import Column, ColumnMeta, Schema


class Person(Schema):
    """Add documentation here."""

    name: Annotated[Column[StringType], ColumnMeta(comment="")]
    age: Annotated[Column[LongType], ColumnMeta(comment="")]
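
The class definition printed above is essentially a rendering of the table's column names and Spark types. The following is a minimal, pure-Python sketch of that idea, not how `typedspark` implements it: `render_schema_source` is a hypothetical helper introduced only for illustration, and it takes the column types as plain strings rather than reading them from a real table.

```python
from typing import List, Tuple


def render_schema_source(schema_name: str, columns: List[Tuple[str, str]]) -> str:
    """Render typedspark-style class source text.

    Hypothetical helper for illustration only; it is not part of typedspark.
    `columns` is a list of (column name, Spark type name) pairs.
    """
    # Collect the distinct Spark type names so each is imported once.
    type_names = sorted({type_name for _, type_name in columns})
    lines = [
        "from typing import Annotated",
        "",
        f"from pyspark.sql.types import {', '.join(type_names)}",
        "",
        "from typedspark import Column, ColumnMeta, Schema",
        "",
        "",
        f"class {schema_name}(Schema):",
        '    """Add documentation here."""',
        "",
    ]
    # Emit one annotated attribute per column, in the given order.
    for column_name, type_name in columns:
        lines.append(
            f'    {column_name}: Annotated[Column[{type_name}], ColumnMeta(comment="")]'
        )
    return "\n".join(lines)


source = render_schema_source("Person", [("name", "StringType"), ("age", "LongType")])
print(source)
```

In practice, `load_table()` and `print_schema()` do this work for you; the sketch only shows why the generated definition can be pasted into a module and edited by hand, for example to fill in the empty `comment` fields.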



You can now use `df` and `Person` just like you would in your IDE. On Databricks, this also comes with an additional advantage: auto-complete on column names. No more looking at `df.columns` every minute to remember the column names!

In [6]:
(
    df
    .filter(Person.age > 25)
    .show()
)

+----+---+
|name|age|
+----+---+
|John| 30|
|Jane| 40|
+----+---+



Of note, `load_table()` automatically runs `register_schema_to_dataset()` on the resulting `DataSet` and `Schema`, which resolves potential column ambiguities.