Query the Name of any student in STUDENTS who scored higher than 75 Marks. Order your output by the last three characters of each name. If two or more students have names ending in the same last three characters (e.g., Bobby, Robby), secondary sort them by ascending ID.
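
The rule above can be sketched in plain Python before turning to Spark. This is only an illustration, not part of the original challenge: the helper `names_above` is hypothetical, rows are assumed to be `(ID, Name, Marks)` tuples, and the marks are made up so that Bobby and Robby both pass the filter and the ID tie-break becomes visible.

```python
# Hypothetical rows as (ID, Name, Marks); the marks are invented for illustration.
rows = [
    (5, "Robby", 80),
    (4, "Bobby", 90),
    (1, "Ashley", 85),
    (7, "Andrew", 70),
]

def names_above(rows, threshold=75):
    """Return names with Marks > threshold, ordered by last three letters, then ID."""
    passing = [r for r in rows if r[2] > threshold]
    # Sort key: (last three characters of Name, ID), both ascending.
    passing.sort(key=lambda r: (r[1][-3:], r[0]))
    return [r[1] for r in passing]

ordered = names_above(rows)
print(ordered)  # ['Bobby', 'Robby', 'Ashley']
```

Bobby and Robby share the suffix `bby`, so the lower ID (4) comes first; Andrew is dropped because 70 is not greater than 75.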

Input Format

The STUDENTS table is described as follows:  

![Output](https://s3.amazonaws.com/hr-challenge-images/12896/1443815243-94b941f556-1.png)


The Name column only contains uppercase (A-Z) and lowercase (a-z) letters.

Sample Input

![Output](https://s3.amazonaws.com/hr-challenge-images/12896/1443815209-cf4b260993-2.png)


Sample Output

Ashley
Julia
Belvet

Explanation

Only Ashley, Julia, and Belvet have Marks > 75. If you look at the last three characters of each of their names, there are no duplicates and 'ley' < 'lia' < 'vet'.
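
As a quick check of the comparison in the explanation, the following sketch sorts the three names by their last three characters using ordinary string comparison. The variable names are illustrative only.

```python
# The three names from the explanation, deliberately out of order.
names = ["Belvet", "Ashley", "Julia"]

# Map each name to its last three characters.
suffixes = {name: name[-3:] for name in names}

# Sort by suffix; plain string comparison gives 'ley' < 'lia' < 'vet'.
by_suffix = sorted(names, key=lambda name: name[-3:])
print(suffixes)
print(by_suffix)  # ['Ashley', 'Julia', 'Belvet']
```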

In [0]:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Initialize Spark session
spark = SparkSession.builder.appName("StudentRecords").getOrCreate()

# Define schema
schema = StructType([
    StructField("ID", IntegerType(), True),
    StructField("Name", StringType(), True),
    StructField("Marks", IntegerType(), True)
])

# Create DataFrame
data = [
    (1, "Ashley", 85),
    (2, "Julia", 90),
    (3, "Belvet", 88),
    (4, "Bobby", 75),
    (5, "Robby", 60),
    (6, "Charlie", 95),
    (7, "Andrew", 80)
]

df = spark.createDataFrame(data, schema=schema)

# Save the DataFrame as a temporary table
df.createOrReplaceTempView("STUDENTS")

# Save as a permanent table (if needed)
# df.write.mode("overwrite").saveAsTable("STUDENTS")

# Show the DataFrame
df.show()


+---+-------+-----+
| ID|   Name|Marks|
+---+-------+-----+
|  1| Ashley|   85|
|  2|  Julia|   90|
|  3| Belvet|   88|
|  4|  Bobby|   75|
|  5|  Robby|   60|
|  6|Charlie|   95|
|  7| Andrew|   80|
+---+-------+-----+



In [0]:
%sql
SELECT *
FROM STUDENTS
WHERE Marks > 75
ORDER BY RIGHT(Name, 3) ASC, ID ASC

ID,Name,Marks
1,Ashley,85
2,Julia,90
6,Charlie,95
7,Andrew,80
3,Belvet,88


In [0]:
from pyspark.sql.functions import substring

# Add a column holding the last 3 characters of Name
# (a negative start position counts from the end of the string)
df_with_last3 = df.withColumn("last_3_chars", substring(df["Name"], -3, 3))

# Keep students with Marks > 75, then sort by the suffix and break ties by ID
result = df_with_last3.where("Marks > 75")
result.orderBy(["last_3_chars", "ID"], ascending=[True, True]).show()

+---+-------+-----+------------+
| ID|   Name|Marks|last_3_chars|
+---+-------+-----+------------+
|  1| Ashley|   85|         ley|
|  2|  Julia|   90|         lia|
|  6|Charlie|   95|         lie|
|  7| Andrew|   80|         rew|
|  3| Belvet|   88|         vet|
+---+-------+-----+------------+

