## Overriding Properties

Let us understand how to override properties while running `hdfs dfs` or `hadoop fs` commands.

* We can override any property that is not marked as final (`<final>true</final>`) in **core-site.xml** or **hdfs-site.xml**.
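* We can set `blocksize` as well as `replication` while copying the files. After the copy, the replication factor can still be changed (for example with `hdfs dfs -setrep`), whereas the block size of an existing file stays fixed unless the file is written again.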
* We can either pass individual properties using `-D`, or pass a set of properties in an XML file similar to **core-site.xml** or **hdfs-site.xml** using `-conf`.
* Let's copy the file **/data/crime/csv/rows.csv** with default values. The file is 1,505,540,526 bytes, so it is split into 12 blocks with 2 copies each (as our default block size is 128 MB and the replication factor is 2).

In [1]:
%%sh

hdfs dfs -ls /user/${USER}/crime

ls: `/user/itv002480/crime': No such file or directory


CalledProcessError: Command 'b'\nhdfs dfs -ls /user/${USER}/crime\n'' returned non-zero exit status 1.

In [2]:
%%sh

hdfs dfs -rm -R -skipTrash /user/${USER}/crime

rm: `/user/itv002480/crime': No such file or directory


CalledProcessError: Command 'b'\nhdfs dfs -rm -R -skipTrash /user/${USER}/crime\n'' returned non-zero exit status 1.

In [3]:
%%sh

hdfs dfs -mkdir -p /user/${USER}/crime/csv

In [4]:
%%sh

ls -lhtr /data/crime/csv

total 1.5G
-rwxr-xr-x 1 root root 1.5G Aug  8  2017 rows.csv


In [5]:
%%sh

hdfs dfs -put /data/crime/csv/rows.csv /user/${USER}/crime/csv

In [6]:
%%sh

hdfs dfs -stat %r /user/${USER}/crime/csv/rows.csv

2


In [8]:
%%sh

hdfs dfs -stat %o /user/${USER}/crime/csv/rows.csv

134217728


In [7]:
%%sh

hdfs dfs -stat %b /user/${USER}/crime/csv/rows.csv

1505540526
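
The block count follows from the two values reported by `-stat`: the file size (`%b`) divided by the block size (`%o`), rounded up. The sketch below repeats that arithmetic in plain shell. The helper name `block_count` is hypothetical, and 64 MB is included only because it is the override value used later in this section.

```shell
# Number of HDFS blocks needed for a file: ceiling of (file size / block size).
# A partially filled last block still counts as one block.
block_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# File size in bytes, as reported by `hdfs dfs -stat %b`.
size=1505540526

echo "128 MB block size: $(block_count "${size}" 134217728) blocks"
echo "64 MB block size:  $(block_count "${size}" 67108864) blocks"
```

This prints 12 blocks for the 128 MB block size and 23 blocks for 64 MB.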


In [8]:
%%sh

hdfs dfs -rm -R -skipTrash /user/${USER}/crime/csv/rows.csv

Deleted /user/itv002480/crime/csv/rows.csv


In [9]:
%%sh

hdfs dfs -Ddfs.blocksize=64M -Ddfs.replication=3 -put /data/crime/csv/rows.csv /user/${USER}/crime/csv

In [13]:
%%sh

hdfs dfs -stat %r /user/${USER}/crime/csv/rows.csv

3


In [10]:
%%sh

hdfs dfs -stat %o /user/${USER}/crime/csv/rows.csv

67108864


In [11]:
%%sh

hdfs dfs -stat %b /user/${USER}/crime/csv/rows.csv

1505540526
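
The same override can also be supplied as a file instead of individual `-D` flags. The following is a minimal sketch, assuming the generic `-conf` option (documented with a single dash) and a hypothetical file `/tmp/override.xml`. The file name and location are not part of the original walkthrough, and the copy is attempted only where the `hdfs` client is installed.

```shell
# Hypothetical override file; its name and location are assumptions for illustration.
conf_file=/tmp/override.xml

# Same two properties as the -D example: 64 MB block size and replication factor 3.
cat > "${conf_file}" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>dfs.blocksize</name>
        <value>67108864</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
EOF

# Run the copy only where the Hadoop client is available.
# -f overwrites the target file if it is already present.
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -conf "${conf_file}" -put -f /data/crime/csv/rows.csv /user/${USER}/crime/csv
fi
```

Generic options such as `-D` and `-conf` go right after `hdfs dfs` and before the file system command (`-put` here).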


In [12]:
%%sh

ls -ltr /etc/hadoop/conf/

total 196
-rwxr-xr-x 1 hdfs hadoop    10 Jul  6  2020 workers
-rwxr-xr-x 1 hdfs hadoop  2697 Jul  6  2020 ssl-server.xml.example
-rwxr-xr-x 1 hdfs hadoop  2316 Jul  6  2020 ssl-client.xml.example
drwxr-xr-x 2 hdfs hadoop  4096 Jul  6  2020 shellprofile.d
-rwxr-xr-x 1 hdfs hadoop  3414 Jul  6  2020 hadoop-user-functions.sh.example
-rwxr-xr-x 1 hdfs hadoop 11765 Jul  6  2020 hadoop-policy.xml
-rwxr-xr-x 1 hdfs hadoop  3321 Jul  6  2020 hadoop-metrics2.properties
-rwxr-xr-x 1 hdfs hadoop  3999 Jul  6  2020 hadoop-env.cmd
-rwxr-xr-x 1 hdfs hadoop   682 Jul  6  2020 kms-site.xml
-rwxr-xr-x 1 hdfs hadoop  1860 Jul  6  2020 kms-log4j.properties
-rwxr-xr-x 1 hdfs hadoop  1351 Jul  6  2020 kms-env.sh
-rwxr-xr-x 1 hdfs hadoop  3518 Jul  6  2020 kms-acls.xml
-rwxr-xr-x 1 hdfs hadoop  2681 Jul  6  2020 user_ec_policies.xml.template
-rwxr-xr-x 1 hdfs hadoop   620 Jul  6  2020 httpfs-site.xml
-rwxr-xr-x 1 hdfs hadoop  1657 Jul  6  2020 httpfs-log4j.properties
-rwxr-xr-x 1 hdfs hadoop  1484 Jul  6  2

In [13]:
%%sh

cat /etc/hadoop/conf/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data1/hadoop/hadoop/dfs/nn/,/data2/hadoop/hadoop/dfs/nn/</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/data1/ha