Server Troubleshooting and Resolution

Omolara Adeboye edited this page Jul 30, 2024 · 7 revisions

Troubleshooting High CPU Usage in Linux

If Prometheus and Grafana indicate high CPU usage on your Linux system, follow these steps to investigate and resolve the issue:

  1. Identify CPU-intensive processes

Use the top command to view real-time system statistics and identify processes consuming high CPU:

top

Or use htop for a more user-friendly interface:

htop
  1. Analyze specific processes

For detailed information about a process, use:

ps aux | grep <process_name_or_PID>
  1. Check system load average: View the 1-, 5-, and 15-minute load averages:
uptime
  1. Monitor CPU usage over time: Use the sar command (part of the sysstat package) to collect, report, and save CPU usage data:
sudo sar -u 1 10

This command reports CPU usage every 1 second for 10 iterations.

  1. Examine CPU core usage: To see CPU usage per core:
mpstat -P ALL 1 5
  1. Investigate high I/O wait times: If I/O wait is high, use iostat to monitor disk I/O:
iostat -xz 1 10
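
A raw load average is easier to read once it is divided by the number of CPU cores. The following is a minimal sketch, assuming a Linux system with /proc/loadavg and nproc available; load_per_core is a hypothetical helper name, not a standard tool. On Linux the load average also counts tasks waiting on I/O, so a sustained value well above 1.0 per core points to either CPU or I/O pressure.

```shell
#!/bin/sh
# Hypothetical helper: divide a load average by the core count.
# $1 = load average, $2 = number of CPU cores
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ]; then
    read -r load_1min rest < /proc/loadavg
    cores=$(nproc)
    echo "1-minute load per core: $(load_per_core "$load_1min" "$cores")"
fi
```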

Resolution steps:

  1. Terminate unnecessary processes:
kill <PID>

If the process ignores the default SIGTERM, force-kill it with SIGKILL as a last resort:

kill -9 <PID>
  1. Adjust process priority:
renice +10 <PID>
  1. Limit CPU usage for a process: Use cgroups or the cpulimit tool:
sudo cpulimit -p <PID> -l 50
  1. Update or optimize software: Keep your system and applications up-to-date:
sudo apt update && sudo apt upgrade
  1. Check for malware: Use tools like rkhunter or chkrootkit
sudo rkhunter --check
  1. Optimize system services: Disable unnecessary services (--now also stops the service if it is currently running):
sudo systemctl disable --now <service_name>

Back up your system before making significant changes, and always test in a non-production environment first.
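
The terminate step above can be scripted so that SIGKILL is sent only when a process does not exit after SIGTERM. This is a sketch, not a definitive implementation: stop_process is a hypothetical helper, and the 5-second default grace period is an assumed value.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to exit, then force-kill it only if it is
# still running after a grace period.
# $1 = PID, $2 = seconds to wait before escalating (default 5, an assumed value)
# Note: POSIX sh has no local variables, so pid and grace are global here.
stop_process() {
    pid=$1
    grace=${2:-5}
    # If SIGTERM cannot be delivered, the process is gone or not ours to signal.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Last resort: SIGKILL cannot be caught or ignored.
    kill -KILL "$pid" 2>/dev/null
}
```

For example, stop_process <PID> 10 gives the process 10 seconds to shut down cleanly before it is killed.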

Troubleshooting and Resolving Low Memory Space in Linux

When your Linux system is running low on memory, follow these steps to diagnose and address the issue:

  1. Check Current Memory Usage

Use the free command to view memory statistics:

free -h

For a more detailed view, use:

cat /proc/meminfo
  1. Identify Memory-Intensive Processes: Use top or htop to see which processes are consuming the most memory
# Use top
top

# Use htop
htop

Sort processes by memory usage in top by pressing Shift+M.

  1. Analyze Specific Processes: For detailed information about a process's memory usage:
ps aux | grep <process_name_or_PID>

To see the memory map of a process:

pmap -x <PID>
  1. Check for Memory Leaks: Use Valgrind to check for memory leaks in a specific application:
valgrind --leak-check=full /path/to/your/program
  1. Monitor Swap Usage. Check swap space usage:
swapon --show
  1. Examine System Logs. Look for any memory-related errors in system logs:
sudo journalctl -p err..emerg
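
The figure that matters most in /proc/meminfo is usually MemAvailable (the "available" column in free), not MemFree, because it estimates how much memory can be used without swapping. The sketch below prints it as a percentage of MemTotal. It assumes a kernel that reports MemAvailable (3.14 or later); mem_available_pct is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print MemAvailable as a percentage of MemTotal.
# $1 = path to a meminfo-format file (defaults to /proc/meminfo)
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "Memory available: $(mem_available_pct)%"
fi
```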

Resolution steps

  1. Terminate unnecessary processes:
kill <PID>

If the process ignores the default SIGTERM, force-kill it with SIGKILL as a last resort:

kill -9 <PID>
  1. Clear Page Cache: Free cached memory (the kernel normally reclaims cache on demand, so this is mainly useful for diagnosis):
sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
  1. Increase Swap Space: Create a new swap file:
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

Add to /etc/fstab for persistence:

/swapfile none swap sw 0 0
  1. Optimize Applications:
  • Update software to latest versions
  • Configure applications to use less memory
  • Use lightweight alternatives for resource-heavy applications
  1. Implement Memory Limits: Use cgroups to set memory limits for services (on cgroup v2 systems, MemoryMax replaces the deprecated MemoryLimit):
sudo systemctl set-property <service_name> MemoryMax=1G
  1. Clean Up Disk Space: Remove unnecessary files and uninstall unused applications:
sudo apt autoremove
sudo apt clean
  1. Consider Hardware Upgrades: If issues persist, consider adding more RAM to your system.

Troubleshooting and Resolving Low Disk Space on a Linux Server

Low disk space on a Linux server can cause various issues, including application crashes and system instability. This guide provides steps and commands to troubleshoot and resolve low disk space issues.

  1. Check Disk Usage

Use the df command to check disk usage of all mounted filesystems.

df -h
  1. Identify Large Files and Directories: Use the du command to identify large files and directories
du -sh /path/to/directory/*

Find the 10 largest files and directories on the root filesystem (-a includes files, -x stays on one filesystem):

du -ahx / | sort -rh | head -10
  1. Clean Up Unnecessary Files
  • Remove Unnecessary Packages
sudo apt-get autoremove
  • Clear Systemd Journal Logs
sudo journalctl --vacuum-size=100M
  • Clear APT Cache (Debian/Ubuntu)
sudo apt-get clean
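  • Delete Old Logs (only files not modified in the last 30 days; adjust the age as needed, and avoid deleting logs that services are still writing to)
sudo find /var/log -type f -name "*.log" -mtime +30 -exec rm -f {} \;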
  1. Investigate and Clear Docker Disk Usage: If you are using Docker, it can consume a significant amount of disk space.
  • Check Docker Disk Usage
sudo docker system df
  • Remove unused Docker data
sudo docker system prune -a

# or skip the confirmation prompt
sudo docker system prune -af
  1. Implement log rotation using tools like logrotate to prevent log files from consuming too much disk space.

  2. Consider adding more disk space or storage to the server if disk space issues persist.
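
The df check from the first step can be turned into a simple threshold report. This is a minimal sketch, assuming POSIX-format output from df -P; filesystems_over is a hypothetical helper and the 90% threshold is an assumed value. Mount points that contain spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print every mount point
# whose usage is at or above a threshold.
# $1 = threshold in percent (default 90, an assumed value)
filesystems_over() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)                 # "93%" -> "93"
        if (use + 0 >= limit + 0) print $6 " " use "%"
    }'
}

# Report filesystems that are at least 90% full.
df -P | filesystems_over 90
```

Run from cron or a monitoring agent, a non-empty report can serve as an early warning before a filesystem fills up completely.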

Server Troubleshooting And Resolution Guide

Network Traffic Issues

Symptoms

  • Slow response times
  • High latency
  • Unexpected bandwidth usage

Troubleshooting Steps

  1. Check network utilization: iftop -i <interface>
  2. Analyze network connections: netstat -tuln
  3. Monitor incoming/outgoing traffic: tcpdump -i <interface> -n

Resolution

  • Optimize application code for network efficiency
  • Implement caching mechanisms
  • Consider load balancing or CDN solutions

Network Errors

Symptoms

  • Connection timeouts
  • DNS resolution failures
  • SSL/TLS errors

Troubleshooting Steps

  1. Check DNS resolution: nslookup <domain>
  2. Test network connectivity: ping <host>, then traceroute <host>
  3. Verify SSL/TLS configuration: openssl s_client -connect <host>:<port>
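
Expired certificates are a common cause of SSL/TLS errors, so it helps to check the expiry date as part of the verification step. The sketch below assumes the openssl command-line tool is installed; fetch_cert and cert_valid_for are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: fetch a server's leaf certificate and save it as PEM.
# $1 = host, $2 = port, $3 = output file
fetch_cert() {
    openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null |
        openssl x509 -outform PEM > "$3"
}

# Hypothetical helper: succeed if the certificate in file $1 will still be
# valid $2 seconds from now (-checkend exits non-zero if it will have expired).
cert_valid_for() {
    openssl x509 -noout -checkend "$2" -in "$1" >/dev/null
}
```

For example, after fetch_cert <host> 443 cert.pem, running cert_valid_for cert.pem 2592000 fails if the certificate expires within the next 30 days.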

Resolution

  • Update DNS settings
  • Check firewall rules
  • Renew or reconfigure SSL/TLS certificates

Disk I/O Issues

Symptoms

  • High disk usage
  • Slow read/write operations
  • I/O wait time spikes

Troubleshooting Steps

  1. Monitor disk I/O: iostat -x 1
  2. Check disk usage: df -h, then du -sh /*
  3. Identify processes causing high I/O: iotop
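
When iostat is not installed, the share of CPU time spent waiting on I/O can be estimated from two samples of the aggregate cpu line in /proc/stat. This is a rough sketch under that assumption; iowait_pct is a hypothetical helper, and the kernel's iowait counter is only an approximation on multi-core systems.

```shell
#!/bin/sh
# Hypothetical helper: percentage of CPU time spent in iowait between two samples.
# $1, $2 = the first ("cpu ...") line of /proc/stat, sampled some time apart.
# Fields 2-9 are user, nice, system, idle, iowait, irq, softirq, steal.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total[NR] = 0; for (i = 2; i <= 9; i++) total[NR] += $i; iow[NR] = $6 }
        END {
            d = total[2] - total[1]
            if (d > 0) printf "%.1f\n", (iow[2] - iow[1]) * 100 / d
        }'
}

if [ -r /proc/stat ]; then
    first=$(head -n 1 /proc/stat)
    sleep 1
    second=$(head -n 1 /proc/stat)
    echo "I/O wait over the last second: $(iowait_pct "$first" "$second")%"
fi
```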

Resolution

  • Optimize database queries
  • Implement proper indexing
  • Consider upgrading to SSDs or faster storage
  • Adjust file system parameters (e.g., noatime mount option)

General Tips

  • Always back up data before making significant changes
  • Keep system and application logs for reference
  • Regularly update and patch your systems
  • Monitor server performance consistently to catch issues early