Skip to content
Fetching contributors…
Cannot retrieve contributors at this time
366 lines (248 sloc) 11.7 KB
.. _amazon-ec2:
==========
Amazon EC2
==========
.. default-domain:: mongodb
.. contents:: On this page
:local:
:backlinks: none
:depth: 1
:class: twocols
MongoDB runs well on `Amazon EC2 <http://aws.amazon.com/ec2/>`_. To
deploy MongoDB on EC2, you can either set up a new instance manually
(refer to :ref:`deploy-mongodb-ec2`) or :mms-docs:`provision machines
on EC2 </tutorial/provision-aws-servers/>` using :mms-docs:`Automation
</tutorial/nav/automation/>`. Use of Amazon EC2 is subject to Amazon's
pricing policy.
.. _storage-considerations:
Storage Considerations
----------------------
EC2 instances can be configured with either ephemeral storage or
persistent storage using the Elastic Block Store (EBS). Ephemeral
storage is lost when instances are terminated, so it is generally not
recommended unless you understand the data loss implications.
For almost all deployments EBS will be the better choice. For
production systems we recommend using
* EBS-optimized EC2 instances
* Provisioned IOPS (PIOPS) EBS volumes
Storage configuration needs vary among deployments, but for best
performance we recommend separate volumes for *data files*, the
*journal*, and the *log*. Each has different write behavior, and
placing them on separate volumes reduces I/O contention.
.. note::
Using different storage devices will affect your ability to create
snapshot-style backups of your data, since the files will be on
different devices and volumes.
Using RAID levels such as RAID0, RAID1, or RAID10 can provide
volume-level redundancy or capacity. Different storage configurations
will have different cost implications, especially when combined with
PIOPS EBS volumes.
.. _deploy-mongodb-ec2:
Manually Deploy MongoDB on EC2
------------------------------
The following steps can be used to deploy MongoDB on EC2 yourself. The
instances will be configured with the following characteristics:
* Amazon Linux
* MongoDB installed via ``yum``
* Individual PIOPS EBS volumes for data (1000 IOPS), journal (250 IOPS), and log (100 IOPS)
* Updated read-ahead values for each block device
* Update ulimit settings
Before continuing be sure to have the following:
* Install `EC2 command line tools <http://aws.amazon.com/developertools/351>`_
* `Generate an EC2 key pair <http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/generating-a-keypair.html>`_ for connecting to the instance via SSH
* `Create a security group <http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html#creating-security-group>`_ that allows SSH connections
Create the instance using the key pair and security group previously
created and also include the :setting:`--ebs-optimized` flag and
specify individual PIOPS EBS volumes (:file:`/dev/xvdf` for data,
:file:`/dev/xvdg` for journal, :file:`/dev/xvdh` for log). Refer to the
documentation for `ec2-run-instances
<http://docs.aws.amazon.com/AWSEC2/latest/CommandLineReference/ApiReference-cmd-RunInstances.html>`_
for more information on devices and parameters.:
.. code-block:: sh
$ ec2-run-instances ami-05355a6c -t m1.large -g [SECURITY-GROUP] -k [KEY-PAIR] -b "/dev/xvdf=:200:false:io1:1000" -b "/dev/xvdg=:25:false:io1:250" -b "/dev/xvdh=:10:false:io1:100" --ebs-optimized true
You can use the instance-id returned to ascertain the IP Address or DNS information for the instance:
.. code-block:: sh
$ ec2-describe-instances [INSTANCE-ID]
Now SSH into the instance:
.. code-block:: sh
$ ssh -i /path/to/keypair.pem ec2-user@ec2-1-2-3-4.amazonaws.com
After login, update installed packages, add the MongoDB yum repo, and install MongoDB:
.. code-block:: sh
$ sudo yum -y update
$ echo "[MongoDB]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64
gpgcheck=0
enabled=1" | sudo tee -a /etc/yum/repos.d/mongodb.repo
$ sudo yum install -y mongodb-org-server mongodb-org-shell mongodb-org-tools
Next, create/configure the mount points, mount each volume, set ownership (MongoDB runs under the :setting:`mongod` user/group), and set the :file:`/journal` link:
.. code-block:: sh
$ sudo mkdir /data /log /journal
$ sudo mkfs.ext4 /dev/xvdf
$ sudo mkfs.ext4 /dev/xvdg
$ sudo mkfs.ext4 /dev/xvdh
$ echo '/dev/xvdf /data ext4 defaults,auto,noatime,noexec 0 0
/dev/xvdg /journal ext4 defaults,auto,noatime,noexec 0 0
/dev/xvdh /log ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ sudo mount /data
$ sudo mount /journal
$ sudo mount /log
$ sudo chown mongod:mongod /data /journal /log
$ sudo ln -s /journal /data/journal
Now configure the following MongoDB parameters by editing the configuration file :file:`/etc/mongod.conf`:
.. code-block:: sh
dbpath = /data
logpath = /log/mongod.log
Optionally, if you don't want MongoDB to start at boot you can issue the following command:
.. code-block:: sh
$ sudo chkconfig mongod off
By default Amazon Linux uses :program:`ulimit` settings that are not
appropriate for MongoDB. To setup :program:`ulimit` to match the
documented :manual:`ulimit settings </reference/ulimit>` use the
following steps:
.. code-block:: sh
$ sudo nano /etc/security/limits.conf
* soft nofile 64000
* hard nofile 64000
* soft nproc 32000
* hard nproc 32000
$ sudo nano /etc/security/limits.d/90-nproc.conf
* soft nproc 32000
* hard nproc 32000
Additionally, default read ahead settings on EC2 are not optimized for
MongoDB. As noted in the read-ahead settings from :manual:`Production
Notes </administration/production-notes>`, the settings should be
adjusted to read approximately 32 blocks (or 16 KB) of data. The
following command will set the readahead appropriately (repeat for
necessary volumes):
.. code-block:: sh
$ sudo blockdev --setra 32 /dev/xvdf
To make this change persistent across system boot you can issue the following command:
.. code-block:: sh
$ echo 'ACTION=="add", KERNEL=="xvdf", ATTR{bdi/read_ahead_kb}="16"' | sudo tee -a /etc/udev/rules.d/85-ebs.rules
Once again, repeat the above command for all required volumes (note:
the device we created was named :file:`/dev/xvdf` but the name used by
the system is :file:`xvdf`).
To start :program:`mongod`, issue the following command:
.. code-block:: sh
$ sudo service mongod start
Then connect to the MongoDB instance using the :program:`mongo` shell:
.. code-block:: sh
$ mongo
MongoDB shell version: 2.4.8
connecting to: test
>
To have MongoDB startup automatically at boot issue the following command:
.. code-block:: sh
$ sudo chkconfig mongod on
For production deployments consider using :manual:`Replica Sets
</replication/>` or :manual:`Sharding </sharding/>`.
Backup, Restore, Verify
-----------------------
Depending upon the configuration of your EC2 instances, there are a
number of ways to conduct regular backups of your data. For specific
instructions on backing up, restoring and verifying refer to
:ref:`ec2-backup-and-restore`.
Deployment Notes
----------------
Instance Types
~~~~~~~~~~~~~~
MongoDB works on most EC2 types including Linux and Windows. We
recommend you use a 64 bit instance as this is
`required for all MongoDB databases of significant size
<http://blog.mongodb.org/post/137788967/32-bit-limitations>`_.
Additionally, we find that the larger instances tend to be on the
freshest ec2 hardware.
Running MongoDB
```````````````
Before starting a :program:`mongod` instance, decide where to put your
data files. Run :program:`df -h` to see a list of available
volumes. On some images, :file:`/mnt` will be the locally-attached
storage volume. Alternatively, you may want to use `Elastic Block Store
<http://aws.amazon.com/ebs/>`_ which will have a different mount
point.
Mounting the filesystem with the :option:`noatime <mount -o noatime>`
attribute can improve disk performance. For example:
.. code-block:: sh
/dev/mapper/my_vol /var/lib/mongodb xfs noatime,noexec 0 0
Create the mongodb data file directory in the desired location and then
start the :program:`mongod` server:
.. code-block:: sh
mkdir /mnt/db
./mongod --fork --logpath ~/mongod.log --dbpath /mnt/db/
Operating System
````````````````
Occasionally, due to the shared and virtualized nature of EC2, an
instance can experience intermittent I/O problems and low responsiveness
compared to other similar instances. Terminating the instance and
bringing up a new one can in some cases result in better performance.
Some people have reported problems with ubuntu 10.04 on ec2.
Please read
`Ubuntu issue 614853 <https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/614853>`_
and
`Linux Kernel issue 16991 <https://bugzilla.kernel.org/show_bug.cgi?id=16991>`_ for further
information.
Networking
~~~~~~~~~~
Port Management
```````````````
By default the database will now be listening on port 27017. The web
administrative UI will be on port 28017.
Keepalive
`````````
Change the default TCP keepalive time to 300 seconds. See our
:manual:`troubleshooting </faq/diagnostics>` page for details.
Secure Instances
~~~~~~~~~~~~~~~~~~
Restrict access to your instances by using the Security Groups feature
within AWS. A Security Group is a set of firewall rules for incoming
packets that can apply to TCP, UDP or ICMP.
A common approach is to create a MongoDB security group that contains
the nodes of your cluster (replica set members or sharded cluster
members), followed by the creation of a separate security group for your
app servers or clients.
Create a rule in your MongoDB security group with the "source" field set
to the Security Group name containing your app servers and the port set
to 27017 (or whatever port you use for your MongoDB). This will ensure
that only your app servers have permission to connect to your MongoDB
instances.
Remember that Security Groups only control ingress traffic.
Communication Across Regions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Every EC2 instance will have a private IP address that can be used to
communicate within the EC2 network. It is also possible to assign a
public "elastic" IP to communicate with the servers from another
network. If using different EC2 regions, servers can only communicate
via public IPs.
To set up a cluster of servers that spans multiple regions, it is
recommended to name the server hostname to the "public dns name"
provided by EC2. This will ensure that servers from a different network
use the public IP, while the local servers use the private IP, thereby
saving costs. This is required since EC2 security groups are local to a
region.
To control communications between instances in different regions (for
example, if you have two members of a replica set in one region and a
third member in another), it is possible to use a built-in firewall
(such as IPtables on Linux) to restrict allowed traffic to certain
(elastic) IP addresses or ports.
For example one solution is following, on each server:
* set the hostname of the server
.. code-block:: sh
sudo hostname server1
* install "bind", it will serve as local resolver
* add a zone for your domain, say "myorg.com", and add the CNAMEs for all your servers
.. code-block:: sh
server1 IN CNAME ec2-50-19-237-42.compute-1.amazonaws.com.
server2 IN CNAME ec2-50-18-157-87.us-west-1.compute.amazonaws.com.
* restart bind and modify /etc/resolv.conf to use the local bind
.. code-block:: sh
search myorg.conf
nameserver 127.0.0.1
Then:
* verify that you can properly resolve server1, server2, ... using a
tool like dig.
* when running mongod, :method:`db.serverStatus()` should show the
correct hostname, e.g. "server1:27017".
* you can then set up replica sets or shards using the simple
hostname. For example connect to server1 and run
:samp:`rs.initiate()`, then :samp:`rs.add('server2:27017')`.
Something went wrong with that request. Please try again.