# sortBy():
    Syntax: RDD.sortBy(<keyfunc>, ascending=True, numPartitions=None)

The sortBy() transformation sorts an RDD using the <keyfunc> argument, a named or anonymous function that returns the sort key for each element of the dataset. Elements are ordered according to the sort order of the key's type. For instance, int and double keys are sorted numerically, whereas String keys are sorted in lexicographical order.
    
    
The ascending argument is a Boolean that defaults to True and specifies the sort order to be used. A descending sort order is specified by setting ascending=False. The numPartitions argument sets the number of partitions in the resulting RDD.
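
The effect of the key type and of the ascending flag described above can be tried locally without a Spark cluster. The sketch below is a plain-Python stand-in built on the built-in sorted() function. The helper name sort_by is hypothetical and is not part of PySpark; it only mirrors the keyfunc and ascending arguments, on the assumption that the keys returned by keyfunc are compared with ordinary Python ordering.

```python
# Plain-Python stand-in for the semantics described above; no Spark required.
# sort_by is a hypothetical helper for illustration, not a PySpark function.
def sort_by(items, keyfunc, ascending=True):
    """Return items ordered by keyfunc(item)."""
    return sorted(items, key=keyfunc, reverse=not ascending)


values = ["10", "9", "2", "33"]

# String keys are compared lexicographically, so "10" sorts before "2".
as_strings = sort_by(values, lambda x: x)

# Integer keys are compared numerically; the elements themselves stay strings.
as_ints = sort_by(values, lambda x: int(x))

# ascending=False reverses the numeric order.
as_ints_desc = sort_by(values, lambda x: int(x), ascending=False)

print(as_strings)    # ['10', '2', '33', '9']
print(as_ints)       # ['2', '9', '10', '33']
print(as_ints_desc)  # ['33', '10', '9', '2']
```

The same three key functions could be passed to RDD.sortBy() in place of the hypothetical helper. The point of the sketch is only that the key type, not the element type, determines the order.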
    
An example of the sortBy() function is shown in Listing 4.16.

In [1]:
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession

In [2]:
sc = SparkContext('local')
spark = SparkSession(sc)

In [3]:
# Load the Spark README and split it into non-empty, space-delimited words
readme = sc.textFile('file:///opt/Spark/README.md')
words = readme.flatMap(lambda x: x.split(' ')).filter(lambda x: len(x) > 0)

In [8]:
# Sort the words by their lowercased first letter, in descending (z to a) order
sortByFirstLetter = words.sortBy(lambda x: x[0].lower(), ascending=False)
sortByFirstLetter.take(40)

['You',
 'you',
 'you',
 'you',
 'You',
 'YARN,',
 'You',
 'you',
 'your',
 'YARN"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn)',
 'web',
 'way',
 'which',
 'which',
 'with',
 'will',
 'when',
 'with',
 'with',
 'variable',
 'Versions',
 'versions',
 'version',
 'Version',
 'using',
 'using',
 'using',
 'use',
 'use',
 'URL,',
 'use',
 'usage',
 'using:',
 'uses',
 'that',
 'tools',
 'the',
 'the',
 'This',
 'To']
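
In the output above, words that share a first letter (for example 'You', 'you', and 'YARN,') are not ordered among themselves, because the key function returns only the lowercased first letter. One possible way to order such ties is a composite key. The sketch below uses plain Python's sorted() as a local stand-in, with a small hand-picked sample of the words. The function name first_letter_then_word is hypothetical, and passing it to words.sortBy() is an assumption rather than something shown in the listing.

```python
# Hand-picked sample of words from the output above (not the full RDD).
sample = ["you", "You", "YARN,", "web", "way", "your"]


def first_letter_then_word(word):
    """Composite key: lowercased first letter, then the whole lowercased word."""
    return (word[0].lower(), word.lower())


# reverse=True plays the role of ascending=False in this local stand-in.
ordered = sorted(sample, key=first_letter_then_word, reverse=True)

print(ordered)  # ['your', 'you', 'You', 'YARN,', 'web', 'way']
```

Tuples are compared element by element, so the first letter still decides the main order and the full word only breaks ties. 'you' and 'You' produce identical keys here, so their relative order comes from Python's stable sort. A distributed sort should not be assumed to preserve input order for equal keys.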