### element_at

- **element_at()** is used to **extract** a specific element from an **array or map** column within a DataFrame.
- Works with both **ArrayType and MapType columns**.

#### 1) element_at() on an array()

- For **arrays**, indexing **starts at 1**:
  - **Positive indices** access elements from the **beginning** (e.g., **index 1** for the **first element**).
  - If the **index is negative**, elements are accessed from the **end of the array towards the beginning**. (e.g., **index -1** for the **last element**).
  - Using **index 0** will result in an **error**.
- If the **index is outside the array boundaries** (its absolute value exceeds the array length):
  - when **spark.sql.ansi.enabled** is **true**, an exception is thrown;
  - otherwise, **NULL** is returned.
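
The indexing rules above can be modelled in plain Python. This is only an illustrative sketch: `element_at_array` and its `ansi_enabled` parameter are hypothetical names standing in for Spark's behaviour and the **spark.sql.ansi.enabled** setting, and the Python exception types are stand-ins rather than the exceptions Spark actually raises.

```python
def element_at_array(arr, index, ansi_enabled=False):
    """Toy model of element_at() on an array (not Spark's implementation)."""
    if arr is None:
        return None  # a NULL array yields NULL
    if index == 0:
        # Index 0 is rejected because array indexing starts at 1
        raise ValueError("array indices start at 1")
    if abs(index) > len(arr):
        if ansi_enabled:
            raise IndexError(f"index {index} is out of bounds for length {len(arr)}")
        return None  # non-ANSI mode: out-of-bounds access yields NULL
    # Positive indices count from the start (1-based), negative ones from the end
    return arr[index - 1] if index > 0 else arr[index]


langs = ["CSharp", "", None, "VS Code"]
first = element_at_array(langs, 1)       # first element
last = element_at_array(langs, -1)       # last element
missing = element_at_array(langs, 10)    # out of bounds, non-ANSI mode
```

Here `first` is `"CSharp"`, `last` is `"VS Code"`, and `missing` is `None`; calling the helper with `ansi_enabled=True` and index `10` raises instead.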

#### Syntax

     element_at(array_col, index)

In [0]:
from pyspark.sql.functions import element_at

data = [('Jay', ['Java','Scala','Python','PySpark',None], {'product':'bmw', 'color':'brown', 'type':'sedan'}),
        ('Midun', ['Spark','Java','Spark SQL','SQL',None], {'product':'audi', 'color':None, 'type':'truck'}),
        ('Robert', ['CSharp','',None,'VS Code'], {'product':'volvo', 'color':'', 'type':'sedan'}),
        ('Paul', None, None),
        ('Basha', ['1','2','3','4','5'], {}),
        ('Gopesh', ['Python','R','SQL','Tableau'], {'product':'tesla', 'color':'red', 'type':'electric'}),
        ('Bobby', ['Go','Rust','C++','Docker'], {'product':'ford', 'color':'blue', 'type':'suv'}),
        ('Chetan', ['JavaScript','React','NodeJS','MongoDB'], {'product':'toyota', 'color':'white', 'type':'hatchback'}),
        ('Dravid', ['Python','Java','AWS','Spark'], {'product':'honda', 'color':'black', 'type':'sedan'}),
        ('Eshwar', ['C','C++','Embedded','Matlab'], {'product':'nissan', 'color':'grey', 'type':'truck'}),
        ('Firoz', ['Scala','Spark','Kafka'], {'product':'mercedes', 'color':'silver', 'type':'suv'}),
        ('Gayathri', ['Java','Spring Boot','Hibernate'], {'product':'bmw', 'color':'blue', 'type':'coupe'}),
        ('Hemanth', ['Python','Pandas','Numpy','ML'], {'product':'audi', 'color':'green', 'type':'sedan'}),
        ('Ishan', ['Ruby','Rails','Postgres'], {'product':'volkswagen', 'color':'yellow', 'type':'hatchback'}),
        ('Jack', ['Kotlin','Android','Firebase'], {'product':'hyundai', 'color':'white', 'type':'suv'})
       ]

columns = ['name','knownLanguages','properties']

df_array = spark.createDataFrame(data, columns)
display(df_array)

name,knownLanguages,properties
Jay,"List(Java, Scala, Python, PySpark, null)","Map(product -> bmw, color -> brown, type -> sedan)"
Midun,"List(Spark, Java, Spark SQL, SQL, null)","Map(product -> audi, color -> null, type -> truck)"
Robert,"List(CSharp, , null, VS Code)","Map(product -> volvo, color -> , type -> sedan)"
Paul,,
Basha,"List(1, 2, 3, 4, 5)",Map()
Gopesh,"List(Python, R, SQL, Tableau)","Map(product -> tesla, color -> red, type -> electric)"
Bobby,"List(Go, Rust, C++, Docker)","Map(product -> ford, color -> blue, type -> suv)"
Chetan,"List(JavaScript, React, NodeJS, MongoDB)","Map(product -> toyota, color -> white, type -> hatchback)"
Dravid,"List(Python, Java, AWS, Spark)","Map(product -> honda, color -> black, type -> sedan)"
Eshwar,"List(C, C++, Embedded, Matlab)","Map(product -> nissan, color -> grey, type -> truck)"


In [0]:
df_array.select(
    "name",
    element_at("knownLanguages", 1).alias("first_element"),
    element_at("knownLanguages", 2).alias("second_element"),
    element_at("knownLanguages", 3).alias("third_element"),
    element_at("knownLanguages", -1).alias("last_element")
).display()

name,first_element,second_element,third_element,last_element
Jay,Java,Scala,Python,
Midun,Spark,Java,Spark SQL,
Robert,CSharp,,,VS Code
Paul,,,,
Basha,1,2,3,5
Gopesh,Python,R,SQL,Tableau
Bobby,Go,Rust,C++,Docker
Chetan,JavaScript,React,NodeJS,MongoDB
Dravid,Python,Java,AWS,Spark
Eshwar,C,C++,Embedded,Matlab


##### Get Element At
- **element_at()** also accepts the **index or key** as a **column expression** instead of a plain Python literal.
- For example, **lit(-1)** wraps the literal -1 in a column expression and retrieves the **last element** of an **array**.

In [0]:
# Retrieve the first, second, and last elements of an array using lit()
from pyspark.sql.functions import lit

df_array.select("knownLanguages",
                element_at("knownLanguages", lit(1)).alias("first_element"),
                element_at("knownLanguages", lit(2)).alias("second_element"),
                element_at("knownLanguages", lit(-1)).alias("last_element")
                ).display()

knownLanguages,first_element,second_element,last_element
"List(Java, Scala, Python, PySpark, null)",Java,Scala,
"List(Spark, Java, Spark SQL, SQL, null)",Spark,Java,
"List(CSharp, , null, VS Code)",CSharp,,VS Code
,,,
"List(1, 2, 3, 4, 5)",1,2,5
"List(Python, R, SQL, Tableau)",Python,R,Tableau
"List(Go, Rust, C++, Docker)",Go,Rust,Docker
"List(JavaScript, React, NodeJS, MongoDB)",JavaScript,React,MongoDB
"List(Python, Java, AWS, Spark)",Python,Java,Spark
"List(C, C++, Embedded, Matlab)",C,C++,Matlab


#### 2) element_at() on a map

- It returns the value associated with a given key.
- If the **key** is **not found** in the **map**, it returns **NULL**.

#### Syntax

     element_at(map_col, key)
- Returns the **value** for the given key.
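
A plain-Python sketch of the map lookup described above. `element_at_map` is a hypothetical helper, not part of PySpark; it only mirrors the NULL-on-missing-key behaviour shown in this section. One consequence worth noting: a NULL result on its own does not say whether the key was absent or present with a NULL value.

```python
def element_at_map(mapping, key):
    """Toy model of element_at() on a map (not Spark's implementation)."""
    if mapping is None:
        return None          # a NULL map yields NULL
    return mapping.get(key)  # a missing key yields None instead of an error


props = {"product": "audi", "color": None, "type": "truck"}
product = element_at_map(props, "product")  # key exists with a value
color = element_at_map(props, "color")      # key exists, value is NULL
height = element_at_map(props, "height")    # key does not exist

# Both lookups return None; a membership test tells the two cases apart
color_present = "color" in props
height_present = "height" in props
```

In this model `color` and `height` are both `None`, while `color_present` is `True` and `height_present` is `False`.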

In [0]:
df_array.select(
    "properties",
    element_at("properties", "product").alias("Product"),
    element_at("properties", "color").alias("Color"),
    element_at("properties", "type").alias("Type")
).display()

properties,Product,Color,Type
"Map(product -> bmw, color -> brown, type -> sedan)",bmw,brown,sedan
"Map(product -> audi, color -> null, type -> truck)",audi,,truck
"Map(product -> volvo, color -> , type -> sedan)",volvo,,sedan
,,,
Map(),,,
"Map(product -> tesla, color -> red, type -> electric)",tesla,red,electric
"Map(product -> ford, color -> blue, type -> suv)",ford,blue,suv
"Map(product -> toyota, color -> white, type -> hatchback)",toyota,white,hatchback
"Map(product -> honda, color -> black, type -> sedan)",honda,black,sedan
"Map(product -> nissan, color -> grey, type -> truck)",nissan,grey,truck


In [0]:
df_array.select(
    "properties",
    element_at("properties", lit("product")).alias("Product"),
    element_at("properties", lit("color")).alias("Color"),
    element_at("properties", lit("type")).alias("Type"),
    element_at("properties", lit("height")).alias("height")
).display()

properties,Product,Color,Type,height
"Map(product -> bmw, color -> brown, type -> sedan)",bmw,brown,sedan,
"Map(product -> audi, color -> null, type -> truck)",audi,,truck,
"Map(product -> volvo, color -> , type -> sedan)",volvo,,sedan,
,,,,
Map(),,,,
"Map(product -> tesla, color -> red, type -> electric)",tesla,red,electric,
"Map(product -> ford, color -> blue, type -> suv)",ford,blue,suv,
"Map(product -> toyota, color -> white, type -> hatchback)",toyota,white,hatchback,
"Map(product -> honda, color -> black, type -> sedan)",honda,black,sedan,
"Map(product -> nissan, color -> grey, type -> truck)",nissan,grey,truck,


**Get a Non-Existing Value from a Map using a Key**
- If you request a key that **doesn’t exist** in the **map**, the element_at() function returns **NULL** instead of throwing an **error**.

In [0]:
# Get a Non-Existing Value from a Map using a Key
df_array.select(
    "properties",
    element_at("properties", "height").alias("Non_Existing")
).display()

properties,Non_Existing
"Map(product -> bmw, color -> brown, type -> sedan)",
"Map(product -> audi, color -> null, type -> truck)",
"Map(product -> volvo, color -> , type -> sedan)",
,
Map(),
"Map(product -> tesla, color -> red, type -> electric)",
"Map(product -> ford, color -> blue, type -> suv)",
"Map(product -> toyota, color -> white, type -> hatchback)",
"Map(product -> honda, color -> black, type -> sedan)",
"Map(product -> nissan, color -> grey, type -> truck)",


#### 3) element_at() in Array and Map Together

In [0]:
# Use element_at() on both array and map columns
df_array.select("knownLanguages", "properties",
                element_at("knownLanguages", 1).alias("first_element"),
                element_at("knownLanguages", -1).alias("last_element"),
                element_at("properties", "product").alias("Product"),
                element_at("properties", "color").alias("Color")
                ).display()

knownLanguages,properties,first_element,last_element,Product,Color
"List(Java, Scala, Python, PySpark, null)","Map(product -> bmw, color -> brown, type -> sedan)",Java,,bmw,brown
"List(Spark, Java, Spark SQL, SQL, null)","Map(product -> audi, color -> null, type -> truck)",Spark,,audi,
"List(CSharp, , null, VS Code)","Map(product -> volvo, color -> , type -> sedan)",CSharp,VS Code,volvo,
,,,,,
"List(1, 2, 3, 4, 5)",Map(),1,5,,
"List(Python, R, SQL, Tableau)","Map(product -> tesla, color -> red, type -> electric)",Python,Tableau,tesla,red
"List(Go, Rust, C++, Docker)","Map(product -> ford, color -> blue, type -> suv)",Go,Docker,ford,blue
"List(JavaScript, React, NodeJS, MongoDB)","Map(product -> toyota, color -> white, type -> hatchback)",JavaScript,MongoDB,toyota,white
"List(Python, Java, AWS, Spark)","Map(product -> honda, color -> black, type -> sedan)",Python,Spark,honda,black
"List(C, C++, Embedded, Matlab)","Map(product -> nissan, color -> grey, type -> truck)",C,Matlab,nissan,grey


#### 4) Array of Maps

In [0]:
data = [
    (1, [{"k": "a", "v": "100"}, {"k": "b", "v": "200"}]),
    (2, [{"k": "x", "v": "111"}, {"k": "y", "v": "222"}]),
    (3, [{"k": "c", "v": "120"}, {"k": "d", "v": "300"}]),
    (4, [{"k": "e", "v": "121"}, {"k": "f", "v": "332"}]),
    (5, [{"k": "g", "v": "130"}, {"k": "h", "v": "400"}]),
    (6, [{"k": "i", "v": "161"}, {"k": "j", "v": "442"}]),
    (7, [{"k": "k", "v": "150"}, {"k": "l", "v": "500"}]),
    (8, [{"k": "m", "v": "191"}, {"k": "n", "v": "662"}])
]

df_arr_map = spark.createDataFrame(data, ["id", "items"])
display(df_arr_map)

id,items
1,"List(Map(k -> a, v -> 100), Map(k -> b, v -> 200))"
2,"List(Map(k -> x, v -> 111), Map(k -> y, v -> 222))"
3,"List(Map(k -> c, v -> 120), Map(k -> d, v -> 300))"
4,"List(Map(k -> e, v -> 121), Map(k -> f, v -> 332))"
5,"List(Map(k -> g, v -> 130), Map(k -> h, v -> 400))"
6,"List(Map(k -> i, v -> 161), Map(k -> j, v -> 442))"
7,"List(Map(k -> k, v -> 150), Map(k -> l, v -> 500))"
8,"List(Map(k -> m, v -> 191), Map(k -> n, v -> 662))"


In [0]:
df_arr_map.select(
    "items",
    element_at("items", 1).alias("first_item"),
    element_at("items", -1).alias("last_item")
).display()

items,first_item,last_item
"List(Map(k -> a, v -> 100), Map(k -> b, v -> 200))","Map(k -> a, v -> 100)","Map(k -> b, v -> 200)"
"List(Map(k -> x, v -> 111), Map(k -> y, v -> 222))","Map(k -> x, v -> 111)","Map(k -> y, v -> 222)"
"List(Map(k -> c, v -> 120), Map(k -> d, v -> 300))","Map(k -> c, v -> 120)","Map(k -> d, v -> 300)"
"List(Map(k -> e, v -> 121), Map(k -> f, v -> 332))","Map(k -> e, v -> 121)","Map(k -> f, v -> 332)"
"List(Map(k -> g, v -> 130), Map(k -> h, v -> 400))","Map(k -> g, v -> 130)","Map(k -> h, v -> 400)"
"List(Map(k -> i, v -> 161), Map(k -> j, v -> 442))","Map(k -> i, v -> 161)","Map(k -> j, v -> 442)"
"List(Map(k -> k, v -> 150), Map(k -> l, v -> 500))","Map(k -> k, v -> 150)","Map(k -> l, v -> 500)"
"List(Map(k -> m, v -> 191), Map(k -> n, v -> 662))","Map(k -> m, v -> 191)","Map(k -> n, v -> 662)"

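
Because each array element in this section is itself a map, a lookup by position can be followed by a lookup by key. The sketch below models that chaining in plain Python; `element_at_model` is a hypothetical stand-in for element_at() that covers only the non-ANSI behaviour described earlier. In PySpark the analogous expression should be a nested call such as `element_at(element_at("items", 1), "v")`, which is an assumption not shown in the cells above.

```python
def element_at_model(value, extraction):
    """Toy model of element_at() for lists and dicts (non-ANSI behaviour only)."""
    if value is None:
        return None                   # NULL input yields NULL
    if isinstance(value, dict):
        return value.get(extraction)  # map: missing key yields None
    if extraction == 0:
        raise ValueError("array indices start at 1")
    if abs(extraction) > len(value):
        return None                   # array: out-of-bounds index yields None
    return value[extraction - 1] if extraction > 0 else value[extraction]


items = [{"k": "a", "v": "100"}, {"k": "b", "v": "200"}]

# Step 1 picks a map by position, step 2 picks a value by key
first_value = element_at_model(element_at_model(items, 1), "v")
last_key = element_at_model(element_at_model(items, -1), "k")
# An out-of-bounds position propagates None through the second lookup
no_value = element_at_model(element_at_model(items, 5), "v")
```

For the first row of the data above, `first_value` is `"100"`, `last_key` is `"b"`, and `no_value` is `None`.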