# Module Feedback

This system records students' responses about their learning experience at university.

![rel](https://sqlzoo.net/w/images/7/71/Module-feedback.png)

In [1]:
import org.apache.log4j.{Level, Logger}
Logger.getLogger("org").setLevel(Level.OFF)

import $ivy.`org.apache.spark::spark-sql:2.4.0`
import org.apache.spark.sql._
import org.apache.spark.sql.functions._

val spark = {
    NotebookSparkSession.builder()
    .progress(false)
    .appName("app11")
    .master("local[*]")
    .config("spark.sql.warehouse.dir", "hdfs://quickstart.cloudera:8020/user/hive/warehouse")
    .config("hive.metastore.uris", "thrift://quickstart.cloudera:9083")
    .config("spark.sql.catalogImplementation", "hive")
    .config("spark.sql.repl.eagerEval.enabled", "True")
    .getOrCreate()
}

import spark.implicits._

Loading spark-stubs, spark-hive
Creating SparkSession

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

import org.apache.log4j.{Level, Logger}
import $ivy.$
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
spark: SparkSession = org.apache.spark.sql.SparkSession@690e0b61
import spark.implicits._

In [2]:
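def sc = spark.sparkContext
// HiveContext has been deprecated since Spark 2.0; it still works on 2.4 and is kept as in the original run
val hiveCxt = new org.apache.spark.sql.hive.HiveContext(sc)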

defined function sc
hiveCxt: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@12fff33a

In [3]:
// extend the DataFrame class to prettify the output of show()
implicit class RichDF(val ds:DataFrame) {
    def showHTML(limit: Int = -1, truncate: Int = 0) = {
        import xml.Utility.escape
        
        val data = if (limit < 0) ds.take(255) else ds.take(limit)
        val header = ds.schema.fieldNames.toSeq        
        val rows: Seq[Seq[String]] = data.map { row =>
          row.toSeq.map { cell =>
            val str = cell match {
              case null => "null"
              case binary: Array[Byte] => binary.map("%02X".format(_)).mkString("[", " ", "]")
              case array: Array[_] => array.mkString("[", ", ", "]")
              case seq: Seq[_] => seq.mkString("[", ", ", "]")
              case _ => cell.toString
            }
            if (truncate > 0 && str.length > truncate) {
              // a truncate width below 4 leaves no room for an ellipsis, so cut the string hard
              if (truncate < 4) str.substring(0, truncate)
              else str.substring(0, truncate - 3) + "..."
            } else {
              str
            }
          }: Seq[String]
        }

        publish.html(s""" <table>
                <tr>
                 ${header.map(h => s"<th>${escape(h)}</th>").mkString}
                </tr>
                ${rows.map { row =>
                  s"<tr>${row.map{c => s"<td>${escape(c)}</td>" }.mkString}</tr>"
                }.mkString}
            </table>
        """)        
    }
}

defined class RichDF
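
The truncation rule inside `showHTML` can be tried on its own, without a Spark session. The sketch below lifts that branch into a standalone helper; the name `truncateCell` is hypothetical and not part of `RichDF`, and the sample strings are only illustrations.

```scala
// Hypothetical helper (not part of RichDF): the same truncation rule as in showHTML.
def truncateCell(str: String, truncate: Int): String =
  if (truncate > 0 && str.length > truncate) {
    // Widths below 4 leave no room for "...", so the string is cut hard.
    if (truncate < 4) str.substring(0, truncate)
    else str.substring(0, truncate - 3) + "..."
  } else {
    // truncate <= 0 (the showHTML default) or a short string: keep it unchanged.
    str
  }

val longCell  = truncateCell("Software Development 2", 10) // "Softwar..."
val hardCut   = truncateCell("Database", 3)                // "Dat"
val untouched = truncateCell("Database", 0)                // "Database"
```

With `truncate = 10`, the result is exactly ten characters long: seven from the original string plus the three-character ellipsis.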

## 1. Find the student name from a matriculation number

**Find the name of the student with number 50200100**

In [4]:
val insspr = hiveCxt.table("sqlzoo.INS_SPR")
(insspr.filter($"spr_code" === "50200100")
 .select($"spr_fnm1", $"spr_surn")
 .showHTML())

20/07/05 10:37:01 INFO metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
20/07/05 10:37:02 INFO metastore: Connected to metastore.


| spr_fnm1 | spr_surn |
| --- | --- |
| Tom | Cotton |

insspr: DataFrame = [spr_code: string, spr_fnm1: string ... 1 more field]

## 2. Find the modules studied by a student

**Show the module code and module name for modules studied by the student with number 50200100 in session 2016/7 TR1**

In [5]:
val insmod = hiveCxt.table("sqlzoo.INS_MOD")
val camsmo = hiveCxt.table("sqlzoo.CAM_SMO")
(insmod.join(camsmo, insmod("MOD_CODE")===camsmo("MOD_CODE"))
 .filter($"SPR_CODE"==="50200100" && $"AYR_CODE"==="2016/7" && $"PSL_CODE"==="TR1")
 .select(camsmo("MOD_CODE"), insmod("MOD_NAME"))
 .showHTML())

| MOD_CODE | MOD_NAME |
| --- | --- |
| CSN08101 | Systems and Services |
| INF08104 | Database Systems |
| SET08108 | Software Development 2 |

insmod: DataFrame = [mod_code: string, mod_name: string ... 1 more field]
camsmo: DataFrame = [spr_code: string, mod_code: string ... 2 more fields]

## 3. Find the modules studied by a student and their module leaders

**Show the module code, module name, and details of the module leader for modules studied by the student with number 50200100 in session 2016/7 TR1**

In [6]:
val insprs = hiveCxt.table("sqlzoo.INS_PRS")
(camsmo.filter($"SPR_CODE"==="50200100" && $"AYR_CODE"==="2016/7" && 
               $"PSL_CODE"==="TR1")
 .join(insmod, camsmo("MOD_CODE")===insmod("MOD_CODE"))
 .join(insprs, insmod("PRS_CODE")===insprs("PRS_CODE"))
 .select(camsmo("MOD_CODE"), insmod("MOD_NAME"), insprs("PRS_CODE"),
         insprs("PRS_FNM1"), insprs("PRS_SURN"))
 .showHTML())

| MOD_CODE | MOD_NAME | PRS_CODE | PRS_FNM1 | PRS_SURN |
| --- | --- | --- | --- | --- |
| CSN08101 | Systems and Services | 40000008 | James | Jackson |
| INF08104 | Database Systems | 40000036 | Andrew | Cumming |
| SET08108 | Software Development 2 | 40000408 | Neil | Urquhart |

insprs: DataFrame = [prs_code: string, prs_fnm1: string ... 1 more field]

## 4. Show the scores for module SET08108

**For each question, show the percentage of students who gave 4 or 5 to module SET08108 in session 2016/7 TR1**

(Note that this is not real data; these responses were randomly generated.)
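
The score is the share of responses that are 4 or 5. On a 1-5 scale, dividing a response by 4 and rounding down gives 1 for a 4 or 5 and 0 otherwise, so summing those values counts the top-two responses; this is what `floor(RES_VALU/4)` does in the query for this question. A minimal plain-Scala sketch of the same arithmetic follows, assuming integer responses on a 1-5 scale. The helper name `topTwoPercent` and the sample values are made up for illustration and do not come from `INS_RES`.

```scala
// Hypothetical helper mirroring round(sum(floor(v / 4)) * 100 / count(v), 0).
def topTwoPercent(responses: Seq[Int]): Double = {
  require(responses.nonEmpty, "need at least one response")
  // For values 1 to 5, integer division by 4 equals floor(v / 4):
  // it yields 1 for 4 and 5, and 0 for 1, 2 and 3.
  val hits = responses.map(_ / 4).sum
  math.round(hits * 100.0 / responses.size).toDouble
}

val sample = Seq(5, 4, 3, 2, 5)   // made-up responses, not real data
val score  = topTwoPercent(sample) // 3 of the 5 responses are 4 or 5
println(score)
```

The trick only holds while responses stay within 1 to 5; a value of 8 or more would contribute 2 or more to the sum and inflate the score.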

In [7]:
val insres = hiveCxt.table("sqlzoo.INS_RES")
val insque = hiveCxt.table("sqlzoo.INS_QUE")
val inscat = hiveCxt.table("sqlzoo.INS_CAT")
(insres.filter($"MOD_CODE"==="SET08108" && $"AYR_CODE"==="2016/7" &&
              $"PSL_CODE"==="TR1")
 .join(insque, insres("QUE_CODE")===insque("QUE_CODE"))
 .join(inscat, insque("CAT_CODE")===inscat("CAT_CODE"))
 // floor(RES_VALU/4) is 1 for a response of 4 or 5 and 0 for a response of 1 to 3
 .withColumn("VALU", floor(insres("RES_VALU")/4))
 .select(insres("QUE_CODE"), insque("QUE_TEXT"), inscat("CAT_NAME"), $"VALU")
 .groupBy("QUE_CODE", "QUE_TEXT", "CAT_NAME")
 .agg(round(sum("VALU")*100/count("VALU"), 0).as("score"))
 .orderBy("QUE_CODE")
 .showHTML())

| QUE_CODE | QUE_TEXT | CAT_NAME | score |
| --- | --- | --- | --- |
| 1.1 | Staff are good at explaining things. | Learning and Teaching | 89.0 |
| 1.2 | Staff made the subject interesting. | Learning and Teaching | 82.0 |
| 1.3 | The module was intellectually stimulating. | Learning and Teaching | 82.0 |
| 1.4 | The aims and objectives were clearly stated. | Learning and Teaching | 89.0 |
| 1.5 | The module was well-organised and ran smoothly. | Learning and Teaching | 78.0 |
| 1.6 | The pace was appropriate. | Learning and Teaching | 80.0 |
| 1.7 | The level was appropriate. | Learning and Teaching | 82.0 |
| 1.8 | The workload was managable. | Learning and Teaching | 78.0 |
| 1.9 | I was able to contact module staff when I needed to. | Learning and Teaching | 76.0 |
| 2.1 | The assessment requirements were clearly stated. | Assessment and Feedback | 84.0 |

insres: DataFrame = [spr_code: string, mod_code: string ... 4 more fields]
insque: DataFrame = [que_code: string, cat_code: string ... 2 more fields]
inscat: DataFrame = [cat_code: string, cat_name: string ... 1 more field]

## 5. Show the frequency chart for module SET08108 for question 4.1

**For each response 1-5, show the number of students who gave that response to question 4.1 (module SET08108, 2016/7, TR1)**

(Note that this is not real data; these responses were randomly generated.)

In [8]:
(insres.filter($"MOD_CODE"==="SET08108" && $"AYR_CODE"==="2016/7" &&
              $"PSL_CODE"==="TR1" && $"QUE_CODE"==="4.1")
 .select("MOD_CODE", "RES_VALU", "SPR_CODE")
 .groupBy("MOD_CODE", "RES_VALU")
 .count()
 .showHTML())

| MOD_CODE | RES_VALU | count |
| --- | --- | --- |
| SET08108 | 2 | 6 |
| SET08108 | 5 | 39 |
| SET08108 | 4 | 10 |
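
The result lists only the responses that occur (2, 4 and 5) and in no particular order, because `groupBy` produces a row only for values present in the data. A frequency chart over the whole 1-5 scale would also need zero counts for the missing responses. The sketch below shows one way to fill them in with plain Scala, using the three counts from the result above. The helper name `fullFrequency` is hypothetical, and treating `RES_VALU` as an integer is an assumption, since the column type is not shown.

```scala
// Counts copied from the result above: RES_VALU -> number of students.
val observed: Map[Int, Long] = Map(2 -> 6L, 5 -> 39L, 4 -> 10L)

// Hypothetical helper: one (response, count) pair per value on the scale,
// in ascending order, with 0 for responses nobody gave.
def fullFrequency(counts: Map[Int, Long], scale: Range = 1 to 5): Seq[(Int, Long)] =
  scale.map(v => v -> counts.getOrElse(v, 0L))

val chart = fullFrequency(observed)
chart.foreach { case (response, n) => println(s"$response: $n") }
```

Doing this after collecting the small grouped result keeps the Spark query unchanged; the same effect could be had inside Spark by joining against a five-row table of scale values.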


In [9]:
spark.stop()