## Managing HDFS Directories

Let us see how to create directories in HDFS and manage their ownership.

* By default, `hdfs` is the superuser of HDFS.
* `hadoop fs -mkdir` or `hdfs dfs -mkdir` – to create directories
* `hadoop fs -chown` or `hdfs dfs -chown` – to change ownership of files and directories
* `-chown` can also change the group (using the `owner:group` form). The group alone can be changed with `-chgrp`. Run `hdfs dfs -help chgrp` to check the details.
* Here are the steps to create a user space (home directory). Only the HDFS superuser, or a member of the HDFS superuser group, can perform them.
  * Create a directory named after the user id (`itversity`) under `/user`.
  * Change the ownership of that directory (`/user/itversity`) to the user of the same name.
  * Validate ownership and permissions by running `hadoop fs -ls` or `hdfs dfs -ls` on `/user`. Make sure to grep for the user name you are looking for.
* Let us create the user space in HDFS for `itversity`. The commands below have to be run from an account with sudo access, because they execute as the `hdfs` user.

```shell
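# Create the home directory as the HDFS superuser
sudo -u hdfs hdfs dfs -mkdir /user/itversity
# Hand ownership to the user and the user's group
sudo -u hdfs hdfs dfs -chown -R itversity:students /user/itversity
# Confirm the owner and group on the new directory
hdfs dfs -ls /user | grep itversity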
```
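The steps above can be wrapped in a small helper. The following is a minimal sketch, not part of the original walkthrough: the function name `create_user_space` and the `RUN_AS_HDFS` variable are hypothetical, and `-mkdir -p` is used instead of plain `-mkdir` so that rerunning it does not fail on an existing directory. By default it only prints the commands (dry run).

```shell
#!/usr/bin/env bash
# Hypothetical helper: create an HDFS home directory for a user.
# RUN_AS_HDFS is the command prefix. It defaults to "echo" (dry run);
# set RUN_AS_HDFS="sudo -u hdfs" to execute as the HDFS superuser.
RUN_AS_HDFS="${RUN_AS_HDFS:-echo}"

create_user_space() {
  local user="$1" group="$2"
  # Create /user/<user>, then hand ownership to <user>:<group>.
  # $RUN_AS_HDFS is left unquoted on purpose so "sudo -u hdfs" splits into words.
  $RUN_AS_HDFS hdfs dfs -mkdir -p "/user/${user}" &&
    $RUN_AS_HDFS hdfs dfs -chown -R "${user}:${group}" "/user/${user}"
}

# With the default prefix this prints the two commands instead of running them
create_user_space itversity students
```

On a real cluster, export `RUN_AS_HDFS="sudo -u hdfs"` from an account with sudo access before calling the function. If only the group needs to change later, `hdfs dfs -chgrp -R <group> /user/<user>` can be used instead of `-chown`.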

* Once the user space exists, you should be able to create folders under your home directory.

In [2]:
%%sh

hdfs dfs -ls /user/${USER}

In [3]:
%%sh

hdfs dfs -mkdir /user/${USER}/retail_db

In [5]:
%%sh

hdfs dfs -ls /user/${USER}

Found 1 items
drwxr-xr-x   - itversity students          0 2021-01-07 18:53 /user/itversity/retail_db


* You can create a whole directory structure in one go using `hdfs dfs -mkdir -p`. Folders that already exist are left as they are, and the missing ones are created.
  * Let us run `hdfs dfs -mkdir -p /user/${USER}/retail_db/orders/year=2020`.
  * As `/user/${USER}/retail_db` already exists, it is left untouched.
  * Both `/user/${USER}/retail_db/orders` and `/user/${USER}/retail_db/orders/year=2020` will be created.
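The `-p` behavior of `hdfs dfs -mkdir` closely follows the Unix `mkdir -p`, so it can be tried out without a cluster. The sketch below is a local-filesystem analogy only: it uses a scratch directory from `mktemp` in place of `/user/${USER}`, and the function name `demo_mkdir_p` is hypothetical.

```shell
#!/usr/bin/env bash
# Local analogy for `hdfs dfs -mkdir -p`; no HDFS commands are run here.
demo_mkdir_p() {
  local base
  base="$(mktemp -d)"            # scratch directory standing in for /user/${USER}

  mkdir "${base}/retail_db"      # the first level already exists

  # Without -p the call fails because retail_db/orders does not exist yet
  if ! mkdir "${base}/retail_db/orders/year=2020" 2>/dev/null; then
    echo "plain mkdir failed: parent missing"
  fi

  # With -p the missing levels are created and retail_db is left untouched
  mkdir -p "${base}/retail_db/orders/year=2020"

  # Running it again succeeds instead of reporting an error
  mkdir -p "${base}/retail_db/orders/year=2020" && echo "second -p call succeeded"

  # List the resulting directories relative to the scratch directory
  find "${base}/retail_db" -type d | sort | sed "s|^${base}||"
  rm -r "${base}"
}

demo_mkdir_p
```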

In [7]:
%%sh

hdfs dfs -help mkdir

-mkdir [-p] <path> ... :
  Create a directory in specified location.
                                                  
  -p  Do not fail if the directory already exists 


In [10]:
%%sh

hdfs dfs -ls -R /user/${USER}/retail_db

In [11]:
%%sh

hdfs dfs -mkdir -p /user/${USER}/retail_db/orders/year=2020

In [12]:
%%sh

hdfs dfs -ls -R /user/${USER}/retail_db

drwxr-xr-x   - itversity students          0 2021-01-07 18:58 /user/itversity/retail_db/orders
drwxr-xr-x   - itversity students          0 2021-01-07 18:58 /user/itversity/retail_db/orders/year=2020


* We can delete a non-empty directory using `hdfs dfs -rm -R` and an empty directory using `hdfs dfs -rmdir`. We will explore `hdfs dfs -rm` in detail later.

In [13]:
%%sh

hdfs dfs -help rmdir

-rmdir [--ignore-fail-on-non-empty] <dir> ... :
  Removes the directory entry specified by each directory argument, provided it is
  empty.


In [14]:
%%sh

hdfs dfs -rmdir /user/${USER}/retail_db/orders/year=2020

In [15]:
%%sh

hdfs dfs -rm /user/${USER}/retail_db

rm: `/user/itversity/retail_db': Is a directory


CalledProcessError: Command 'b'\nhdfs dfs -rm /user/itversity/retail_db\n'' returned non-zero exit status 1.

In [16]:
%%sh

hdfs dfs -rmdir /user/${USER}/retail_db

rmdir: `/user/itversity/retail_db': Directory is not empty


CalledProcessError: Command 'b'\nhdfs dfs -rmdir /user/itversity/retail_db\n'' returned non-zero exit status 1.

In [17]:
%%sh

hdfs dfs -rm -R /user/${USER}/retail_db

21/01/07 19:01:22 INFO fs.TrashPolicyDefault: Moved: 'hdfs://nn01.itversity.com:8020/user/itversity/retail_db' to trash at: hdfs://nn01.itversity.com:8020/user/itversity/.Trash/Current/user/itversity/retail_db


In [18]:
%%sh

hdfs dfs -ls /user/${USER}

Found 1 items
drwx------   - itversity students          0 2021-01-07 19:01 /user/itversity/.Trash
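
The `INFO fs.TrashPolicyDefault` line above shows that `-rm -R` moved the directory to the trash rather than deleting it outright, so it can still be brought back. Here is a minimal sketch of restoring it, assuming trash is enabled on the cluster and follows the layout seen in that log line. `trash_path` is a hypothetical helper, not an HDFS command, and the HDFS calls are skipped when no `hdfs` client is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute where `hdfs dfs -rm` places a path in the trash,
# following the pattern from the log output:
#   /user/<user>/.Trash/Current<original absolute path>
trash_path() {
  local user="$1" path="$2"
  echo "/user/${user}/.Trash/Current${path}"
}

user="${USER:-itversity}"
original="/user/${user}/retail_db"
trashed="$(trash_path "${user}" "${original}")"
echo "would restore ${trashed} -> ${original}"

# Only touch HDFS when the client is installed and the trashed copy exists
if command -v hdfs >/dev/null 2>&1 && hdfs dfs -test -d "${trashed}"; then
  hdfs dfs -mv "${trashed}" "${original}"
fi
```

To delete immediately without going through the trash, `hdfs dfs -rm` also accepts `-skipTrash`. Use it with care, as the data cannot be restored this way afterwards.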
