# Metadata

**L1 Taxonomy** - Backend Development

**L2 Taxonomy** - GraphQL APIs

**Subtopic** - Introduction to GraphQL APIs with Python

**Use Case** - Develop a simple GraphQL API using Python that will read and write data to a local JSON file. The API should be able to perform basic CRUD operations - Create, Read, Update, and Delete. The data structure can be a list of dictionaries, where each dictionary represents a user with fields: 'id', 'name', and 'email'. Use the 'graphene' library for creating the GraphQL schema and resolvers.

**Programming Language** - Python

**Target Model** - o1

# Setup

```requirements.txt
graphene==3.1.1
Flask==2.3.2
flask-graphql==2.0.1
```


# Prompt

Problem Description:
Implement a Flask‑based GraphQL API in Python that manages user records stored in a local JSON file. The API must support create, read, update and delete operations via GraphQL queries and mutations. Requests must be JSON‑only and validated accordingly. The system must enforce email uniqueness, atomic file writes, in‑memory caching with TTL, a Levenshtein‑based fuzzy search on name and email, bulk email updates, middleware logging, and expose a health‑check endpoint. A background file‑watcher thread must invalidate the cache on external JSON modifications.

- Input Format and Constraints:
All GraphQL operations are sent as POST to /graphql with Content‑Type: application/json. The JSON body must follow GraphQL request structure. No other content types are allowed. The USERS\_JSON\_PATH environment variable may override the default “users.json” file location. The cache TTL is fixed at 5 seconds. No external network calls are permitted.



- Expected Output Format:
Standard GraphQL JSON responses for queries and mutations. The /health endpoint returns JSON with “status” and “timestamp” fields. Business errors are reported in the response payload: a mutation that hits a duplicate email or a missing user returns ok=false with a message, and userById returns null for an unknown ID. Non‑JSON requests raise ValueError.

- Examples:
Query all users with pagination and search:

```graphql
query {
  allUsers(limit: 10, offset: 0, search: "alice") {
    id name email createdAt updatedAt
  }
}
```

Query single user by ID:

```graphql
query {
  userById(id: "uuid‑string") {
    name email
  }
}
```

Create user mutation:

```graphql
mutation {
  createUser(userData: {name: "Bob", email: "bob@example.com"}) {
    ok message user { id name email }
  }
}
```

Update user mutation:

```graphql
mutation {
  updateUser(userData: {id: "uuid", name: "Bob Jr.", email: "bobjr@example.com"}) {
    ok message
  }
}
```

Delete user mutation:

```graphql
mutation {
  deleteUser(id: "uuid") { ok message }
}
```

Bulk email update:

```graphql
mutation {
  bulkUpdateEmails(updates: [
    {id: "uuid1", email: "a@x.com"},
    {id: "uuid2", email: "b@y.com"}
  ]) { ok updatedCount failed }
}
```
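
For reference, each operation above reaches the server as a JSON object with a `query` key and, optionally, a `variables` key. The following is a minimal client-side sketch using only the standard library. The helper name `build_graphql_request` and the URL `http://localhost:5000/graphql` are assumptions made for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_graphql_request(
    query: str,
    variables: Optional[Dict[str, Any]] = None,
    url: str = "http://localhost:5000/graphql",  # assumed local dev address
) -> urllib.request.Request:
    """Wrap a GraphQL document in the JSON envelope the endpoint expects."""
    payload: Dict[str, Any] = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Passing the search term as a variable avoids string-formatting the query.
req = build_graphql_request(
    "query ($term: String) { allUsers(search: $term) { id name email } }",
    variables={"term": "alice"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen` while the server is running.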

# Requirements



Explicit and Implicit Points:

* JSON‑only GraphQL endpoint with Content‑Type check
* CRUD via Graphene schema
* Email uniqueness validation
* Atomic write to JSON file
* In‑memory cache with 5s TTL
* Levenshtein fuzzy search controlled by maxDistance
* Bulk email updates
* Middleware logging of field names and args
* Health‑check route
* File‑watcher thread invalidates cache on file change

Solution Expectations:
The API must behave exactly as defined above. Atomic writes must never corrupt users.json. Cache must serve reads within TTL. Fuzzy and exact searches must filter correctly. Duplicate emails must be rejected. Bulk updates must report successes and failures.
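
The "serve reads within TTL" expectation can be illustrated in isolation. This is a rough sketch, not the required implementation: the class name `TTLCache` and its injectable `clock` parameter are hypothetical, introduced so the expiry logic can be exercised without sleeping.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Single-value cache that expires `ttl` seconds after each store."""

    def __init__(self, ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._value: Any = None
        self._stored_at: Optional[float] = None

    def get(self) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        if self._stored_at is None:
            return None
        if self._clock() - self._stored_at >= self._ttl:
            return None
        return self._value

    def put(self, value: Any) -> None:
        """Store a value and restart the TTL window."""
        self._value = value
        self._stored_at = self._clock()

    def invalidate(self) -> None:
        """Force the next get() to miss, e.g. after a write to the file."""
        self._stored_at = None


# Drive the cache with a fake clock so the example does not need to sleep.
fake_now = [0.0]
cache = TTLCache(ttl=5, clock=lambda: fake_now[0])
cache.put([{"id": "1", "name": "Alice"}])
fake_now[0] = 4.9
hit = cache.get()    # still inside the 5 s window
fake_now[0] = 5.0
miss = cache.get()   # expired, so a caller would re-read the file
```

A monotonic clock is used as the default because wall-clock time can jump backwards or forwards, which would shorten or extend the TTL window.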

Function Signatures:

```py
def _atomic_write(path: str, data: str) -> None
def load_users(force_reload: bool = False) -> List[Dict[str, Any]]
def save_users(users: List[Dict[str, Any]]) -> None
def find_user_index(user_list: List[Dict[str, Any]], user_id: str) -> int
def levenshtein(a: str, b: str) -> int
```
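
To make the expected behaviour of `levenshtein` concrete, here is a small sketch of the same edit distance computed with two rows instead of a full table. The function name `levenshtein_two_rows` is hypothetical. Note that plain Levenshtein distance has no transposition operation, so swapping two adjacent characters costs 2.

```python
def levenshtein_two_rows(a: str, b: str) -> int:
    """Edit distance with insert, delete and substitute, each costing 1."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[len(b)]


print(levenshtein_two_rows("bobb", "bob"))    # one deletion
print(levenshtein_two_rows("bobb", "bobby"))  # one insertion
print(levenshtein_two_rows("bbo", "bob"))     # a swap counts as two edits
```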

Class Definitions:

```py
class JSONOnlyGraphQLView(GraphQLView):
    def parse_body(self, request)

class UserType(graphene.ObjectType)
class UserInput(graphene.InputObjectType)
class Query(graphene.ObjectType)
class CreateUser(graphene.Mutation)
class UpdateUser(graphene.Mutation)
class DeleteUser(graphene.Mutation)
class BulkUpdateEmails(graphene.Mutation)
```

Edge Case Behavior:

* Corrupted JSON file resets to empty list
* Non‑JSON requests raise ValueError("Only 'application/json' supported.")
* Missing users return null in userById
* Duplicate email mutations return ok=False with message
* Fuzzy search with no matches returns empty list
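
The content-type rule in the list above can be sketched as a standalone check. This is an illustration under assumptions: the helper name `require_json` is hypothetical, and it compares the media type exactly after dropping parameters, which is stricter than a substring test.

```python
from typing import Optional


def require_json(content_type: Optional[str]) -> None:
    """Raise ValueError unless the header's media type is application/json."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        raise ValueError("Only 'application/json' supported.")


require_json("application/json; charset=utf-8")  # accepted, returns None

try:
    require_json("text/plain")
except ValueError as exc:
    rejected_message = str(exc)
```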

Constraints:

* Do not use any libraries beyond the specified dependencies and standard library
* No use of reversed(), eval() or exec()
* File watcher must run as daemon thread
* Cache TTL fixed at 5 seconds; not configurable via API
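
One way to satisfy the daemon-thread constraint is to poll the file's modification time. This is a sketch under assumptions: `read_mtime`, `start_watcher`, the `on_change` callback and the `stop` event are hypothetical names, and the stop event exists only so the thread can be shut down cleanly in tests.

```python
import os
import threading
from typing import Callable, Optional


def read_mtime(path: str) -> Optional[float]:
    """Return the file's modification time, or None if it cannot be read."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def start_watcher(
    path: str,
    on_change: Callable[[], None],
    interval: float = 1.0,
    stop: Optional[threading.Event] = None,
) -> threading.Thread:
    """Call on_change() whenever the mtime of `path` differs from last poll."""
    stop = stop or threading.Event()
    # Take the baseline before the thread starts so that no modification
    # made after this function returns can be missed.
    baseline = read_mtime(path)

    def _watch() -> None:
        last = baseline
        # Event.wait doubles as an interruptible sleep; it returns True
        # once stop is set, which ends the loop.
        while not stop.wait(interval):
            current = read_mtime(path)
            if current != last:
                last = current
                on_change()

    thread = threading.Thread(target=_watch, daemon=True)
    thread.start()
    return thread
```

In the API, `on_change` would be whatever clears the in-memory cache. A polling interval shorter than the cache TTL is what makes the watcher useful; with a longer one the TTL would expire first.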

Important Notes:
Assume input JSON is well‑formed GraphQL; only content‑type and business validations are required. On constraint violations, raise ValueError or return GraphQL errors as specified. Tests should not rely on timing longer than TTL + 1s.


In [1]:
# code

import os
import json
import uuid
import threading
import logging
import datetime
from typing import Any, Dict, List

import graphene
from flask import Flask
from flask_graphql import GraphQLView

USERS_FILE_PATH = os.environ.get('USERS_JSON_PATH', 'users.json')
_FILE_LOCK = threading.Lock()
_CACHE_TTL_SECONDS = 5
_cache: Dict[str, Any] = {'users': [], 'timestamp': None}

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


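def _atomic_write(path: str, data: str) -> None:
    """Atomically write data to a file."""
    # The temp file sits next to the target so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    temp_path = f"{path}.{uuid.uuid4().hex}.tmp"
    try:
        with open(temp_path, 'w', encoding='utf-8') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(temp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if the write fails.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise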


def load_users(force_reload: bool = False) -> List[Dict[str, Any]]:
    now = datetime.datetime.utcnow()
    if (not force_reload and _cache['timestamp'] and
            (now - _cache['timestamp']).total_seconds() < _CACHE_TTL_SECONDS):
        return _cache['users']
    with _FILE_LOCK:
        if not os.path.exists(USERS_FILE_PATH):
            _atomic_write(USERS_FILE_PATH, '[]')
        with open(USERS_FILE_PATH, 'r', encoding='utf-8') as f:
            try:
                data = json.load(f)
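            except json.JSONDecodeError:
                logger.warning("Corrupted JSON, resetting list.")
                data = []
            # Valid JSON that is not a list is treated as corrupted as well.
            if not isinstance(data, list):
                logger.warning("Unexpected JSON structure, resetting list.")
                data = []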
        _cache['users'] = data
        _cache['timestamp'] = now
    return data


def save_users(users: List[Dict[str, Any]]) -> None:
    serialized = json.dumps(users, indent=2, default=str)
    with _FILE_LOCK:
        _atomic_write(USERS_FILE_PATH, serialized)
        _cache['timestamp'] = None


def find_user_index(user_list: List[Dict[str, Any]], user_id: str) -> int:
    for idx, u in enumerate(user_list):
        if u.get('id') == user_id:
            return idx
    return -1


def levenshtein(a: str, b: str) -> int:
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,
                dp[i][j - 1] + 1,
                dp[i - 1][j - 1] + cost
            )
    return dp[m][n]


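class JSONOnlyGraphQLView(GraphQLView):
    def parse_body(self, request=None):
        # Assumption: Flask-GraphQL 2.x calls parse_body() with no arguments
        # and reads Flask's request proxy, so fall back to it when no
        # request object is passed in.
        from flask import request as flask_request
        req = request if request is not None else flask_request
        if (not req.content_type or
                'application/json' not in req.content_type):
            raise ValueError(
                "Only 'application/json' supported."
            )
        return super().parse_body()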


class UserType(graphene.ObjectType):
    id = graphene.ID(required=True)
    name = graphene.String(required=True)
    email = graphene.String(required=True)
    created_at = graphene.String()
    updated_at = graphene.String()


class UserInput(graphene.InputObjectType):
    id = graphene.ID()
    name = graphene.String(required=True)
    email = graphene.String(required=True)


class Query(graphene.ObjectType):
    all_users = graphene.List(
        UserType,
        limit=graphene.Int(),
        offset=graphene.Int(),
        search=graphene.String(),
        fuzzy=graphene.String(),
        max_distance=graphene.Int()
    )
    user_by_id = graphene.Field(UserType, id=graphene.ID(required=True))

    def resolve_all_users(self, info, limit=None, offset=None,
                          search=None, fuzzy=None, max_distance=2):
        users = load_users()
        if fuzzy:
            key = fuzzy.lower()
            users = [
                u for u in users
                if levenshtein(key, u['name'].lower()) <= max_distance
                or levenshtein(key, u['email'].lower()) <= max_distance
            ]
        elif search:
            term = search.lower()
            users = [
                u for u in users
                if term in u['name'].lower()
                or term in u['email'].lower()
            ]
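        start = max(offset or 0, 0)
        # Compare against None so that limit=0 yields an empty page
        # instead of the full list.
        end = start + limit if limit is not None else None
        return [UserType(**u) for u in users[start:end]]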

    def resolve_user_by_id(self, info, id: str):
        users = load_users()
        idx = find_user_index(users, id)
        return UserType(**users[idx]) if idx != -1 else None


class CreateUser(graphene.Mutation):
    class Arguments:
        user_data = UserInput(required=True)
    ok = graphene.Boolean()
    user = graphene.Field(UserType)
    message = graphene.String()

    def mutate(self, info, user_data):
        users = load_users(force_reload=True)
        # Normalise once so the uniqueness check and the stored value agree.
        name = user_data.name.strip()
        email = user_data.email.strip()
        if any(u['email'].lower() == email.lower() for u in users):
            return CreateUser(ok=False, message="Email exists.")
        now = datetime.datetime.utcnow().isoformat()
        new_user = {
            'id': str(uuid.uuid4()),
            'name': name,
            'email': email,
            'created_at': now,
            'updated_at': now
        }
        users.append(new_user)
        save_users(users)
        return CreateUser(ok=True, user=UserType(**new_user),
                          message="User created.")


class UpdateUser(graphene.Mutation):
    class Arguments:
        user_data = UserInput(required=True)
    ok = graphene.Boolean()
    user = graphene.Field(UserType)
    message = graphene.String()

    def mutate(self, info, user_data):
        users = load_users(force_reload=True)
        idx = find_user_index(users, user_data.id)
        if idx == -1:
            return UpdateUser(ok=False, message="Not found.")
        # Strip before comparing so padded input cannot bypass the check.
        email = user_data.email.strip()
        if any(u['email'].lower() == email.lower()
               and u['id'] != user_data.id for u in users):
            return UpdateUser(ok=False, message="Email in use.")
        now = datetime.datetime.utcnow().isoformat()
        users[idx].update({
            'name': user_data.name.strip(),
            'email': email,
            'updated_at': now
        })
        save_users(users)
        return UpdateUser(ok=True, user=UserType(**users[idx]),
                          message="User updated.")


class DeleteUser(graphene.Mutation):
    class Arguments:
        id = graphene.ID(required=True)
    ok = graphene.Boolean()
    message = graphene.String()

    def mutate(self, info, id):
        users = load_users(force_reload=True)
        idx = find_user_index(users, id)
        if idx == -1:
            return DeleteUser(ok=False, message="Not found.")
        deleted = users.pop(idx)
        save_users(users)
        return DeleteUser(ok=True, message=f"Deleted {deleted['name']}")


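class EmailUpdateInput(graphene.InputObjectType):
    # Bulk updates carry only id and email. UserInput also requires name,
    # which would reject the documented bulkUpdateEmails example.
    id = graphene.ID(required=True)
    email = graphene.String(required=True)


class BulkUpdateEmails(graphene.Mutation):
    class Arguments:
        updates = graphene.List(EmailUpdateInput, required=True)
    ok = graphene.Boolean()
    updated_count = graphene.Int()
    failed = graphene.List(graphene.String)

    def mutate(self, info, updates):
        users = load_users(force_reload=True)
        failed_list: List[str] = []
        count = 0
        for inp in updates:
            idx = find_user_index(users, inp.id)
            if idx == -1:
                failed_list.append(f"{inp.id} not found")
                continue
            email = inp.email.strip()
            # Enforce the same uniqueness rule as the single-user mutations.
            if any(u['email'].lower() == email.lower() and u['id'] != inp.id
                   for u in users):
                failed_list.append(f"{inp.id} email in use")
                continue
            users[idx]['email'] = email
            users[idx]['updated_at'] = (
                datetime.datetime.utcnow().isoformat()
            )
            count += 1
        save_users(users)
        return BulkUpdateEmails(ok=True, updated_count=count,
                                 failed=failed_list)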


def logging_middleware(next_, root, info, **args):
    logger.info(f"GraphQL {info.field_name}, args={args}")
    try:
        return next_(root, info, **args)
    except Exception as e:
        logger.error(f"Error in {info.field_name}: {e}")
        raise


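class Mutation(graphene.ObjectType):
    # graphene.Schema expects an ObjectType subclass whose attributes are
    # the mutation fields, not an ObjectType instance.
    create_user = CreateUser.Field()
    update_user = UpdateUser.Field()
    delete_user = DeleteUser.Field()
    bulk_update_emails = BulkUpdateEmails.Field()


app = Flask(__name__)
schema = graphene.Schema(query=Query, mutation=Mutation)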
app.add_url_rule(
    '/graphql',
    view_func=JSONOnlyGraphQLView.as_view(
        'graphql', schema=schema, graphiql=True,
        middleware=[logging_middleware]
    )
)

@app.route('/health', methods=['GET'])
def health_check():
    return {'status': 'ok',
            'timestamp': datetime.datetime.utcnow().isoformat()}


def start_file_watcher(interval_seconds: float = 1.0) -> None:
    """Poll the users file and invalidate the cache when its mtime changes.

    The default interval is shorter than the cache TTL; with a longer one
    the TTL would expire before the watcher noticed a change.
    """
    def _mtime():
        try:
            return os.path.getmtime(USERS_FILE_PATH)
        except OSError:
            return None

    def _watch():
        last_mod = _mtime()
        while True:
            threading.Event().wait(interval_seconds)
            new_mod = _mtime()
            # Comparing against None as well catches the file appearing
            # or disappearing, not only being rewritten.
            if new_mod != last_mod:
                logger.info("Cache invalidated.")
                _cache['timestamp'] = None
                last_mod = new_mod

    watcher = threading.Thread(target=_watch, daemon=True)
    watcher.start()


if __name__ == '__main__':
    if not os.path.exists(USERS_FILE_PATH):
        _atomic_write(USERS_FILE_PATH, '[]')
    start_file_watcher()
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 5000)))



In [None]:
# tests
import os
import tempfile
import importlib
import json
import pytest

from flask import json as flask_json

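@pytest.fixture(scope="module")
def client():
    # The built-in monkeypatch fixture is function-scoped and cannot be
    # requested from a module-scoped fixture, so the environment variable
    # is set and restored by hand.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()
    previous = os.environ.get('USERS_JSON_PATH')
    os.environ['USERS_JSON_PATH'] = tmp.name

    # reload main module to pick up env var
    import main
    importlib.reload(main)
    client = main.app.test_client()
    yield client, tmp.name

    if previous is None:
        os.environ.pop('USERS_JSON_PATH', None)
    else:
        os.environ['USERS_JSON_PATH'] = previous
    os.unlink(tmp.name)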

def test_health(client):
    client_app, _ = client
    rv = client_app.get('/health')
    assert rv.status_code == 200
    data = rv.get_json()
    assert data['status'] == 'ok'
    assert 'timestamp' in data

def test_rejects_non_json(client):
    client_app, _ = client
    rv = client_app.post(
        '/graphql',
        data='query{}',
        headers={'Content-Type': 'text/plain'}
    )
    assert rv.status_code != 200

def test_corrupted_json_resets(client):
    client_app, fname = client
    # write invalid JSON
    with open(fname, 'w') as f:
        f.write('not a json')
    # now query allUsers should succeed and return empty list
    qry = '{ allUsers { id name email } }'
    rv = client_app.post('/graphql', json={'query': qry})
    assert rv.status_code == 200
    data = rv.get_json()['data']['allUsers']
    assert data == []

def test_create_and_duplicate(client):
    client_app, _ = client
    create_q = '''
        mutation {
          createUser(userData: {name: "Alice", email: "alice@example.com"}) {
            ok message user { id name email }
          }
        }
    '''
    rv1 = client_app.post('/graphql', json={'query': create_q})
    assert rv1.status_code == 200
    res1 = rv1.get_json()['data']['createUser']
    assert res1['ok'] is True
    assert res1['user']['name'] == 'Alice'

    # duplicate email should fail
    rv2 = client_app.post('/graphql', json={'query': create_q})
    res2 = rv2.get_json()['data']['createUser']
    assert res2['ok'] is False
    assert 'Email exists' in res2['message']

def test_search_and_fuzzy(client):
    client_app, _ = client
    # create two users
    qs = [
        '''
        mutation {
          createUser(userData: {name: "Bob", email: "bob@example.com"}) {
            user { id name email }
          }
        }
        ''',
        '''
        mutation {
          createUser(userData: {name: "Bobby", email: "bobby@example.com"}) {
            user { id name email }
          }
        }
        '''
    ]
    for q in qs:
        client_app.post('/graphql', json={'query': q})

    # exact search
    rv = client_app.post('/graphql', json={'query': '{ allUsers(search: "Bob") { name } }'})
    names = [u['name'] for u in rv.get_json()['data']['allUsers']]
    assert "Bob" in names and "Bobby" in names

    # fuzzy search with maxDistance=1: "Bobb" is one edit away from both
    # "bob" and "bobby" (a swap such as "Bbo" -> "Bob" would cost 2 edits)
    rv2 = client_app.post('/graphql', json={'query': '{ allUsers(fuzzy: "Bobb", maxDistance: 1) { name } }'})
    names2 = [u['name'] for u in rv2.get_json()['data']['allUsers']]
    assert "Bob" in names2 and "Bobby" in names2

def test_update_user_and_not_found(client):
    client_app, _ = client
    # create a user
    rv = client_app.post('/graphql', json={'query': '''
        mutation {
          createUser(userData: {name: "Carol", email: "carol@example.com"}) {
            user { id }
          }
        }
    '''})
    user_id = rv.get_json()['data']['createUser']['user']['id']

    # update existing user
    upd = f'''
        mutation {{
          updateUser(userData: {{id: "{user_id}", name: "Caroline", email: "caroline@example.com"}}) {{
            ok user {{ name email }}
          }}
        }}
    '''
    rv2 = client_app.post('/graphql', json={'query': upd})
    data2 = rv2.get_json()['data']['updateUser']
    assert data2['ok'] is True
    assert data2['user']['name'] == 'Caroline'

    # update non-existent should fail
    rv3 = client_app.post('/graphql', json={'query': '''
        mutation {
          updateUser(userData: {id: "no-id", name: "X", email: "x@example.com"}) {
            ok message
          }
        }
    '''})
    res3 = rv3.get_json()['data']['updateUser']
    assert res3['ok'] is False

def test_delete_user_and_not_found(client):
    client_app, _ = client
    # create then delete
    rv = client_app.post('/graphql', json={'query': '''
        mutation {
          createUser(userData: {name: "Dave", email: "dave@example.com"}) {
            user { id }
          }
        }
    '''})
    user_id = rv.get_json()['data']['createUser']['user']['id']

    del_q = f'''
        mutation {{
          deleteUser(id: "{user_id}") {{ ok message }}
        }}
    '''
    rv2 = client_app.post('/graphql', json={'query': del_q})
    assert rv2.get_json()['data']['deleteUser']['ok'] is True

    # deleting again should fail
    rv3 = client_app.post('/graphql', json={'query': del_q})
    assert rv3.get_json()['data']['deleteUser']['ok'] is False

def test_bulk_update_and_file_watcher(client):
    client_app, fname = client
    # create two users
    ids = []
    for name in ("E1", "E2"):
        q = f'''
            mutation {{
              createUser(userData: {{name: "{name}", email: "{name.lower()}@x.com"}}) {{
                user {{ id }}
              }}
            }}
        '''
        rv = client_app.post('/graphql', json={'query': q})
        ids.append(rv.get_json()['data']['createUser']['user']['id'])

    # bulk update with one invalid
    updates = ','.join([
        f'{{id: "{ids[0]}", email: "n1@x.com"}}',
        f'{{id: "{ids[1]}", email: "n2@x.com"}}',
        '{id: "bad", email: "z@z.com"}'
    ])
    bunq = f'''
        mutation {{
          bulkUpdateEmails(updates: [{updates}]) {{
            ok updatedCount failed
          }}
        }}
    '''
    rv2 = client_app.post('/graphql', json={'query': bunq})
    bu = rv2.get_json()['data']['bulkUpdateEmails']
    assert bu['updatedCount'] == 2
    assert any('bad not found' in f for f in bu['failed'])

    # externally overwrite the file with an empty list
    with open(fname, 'w') as f:
        f.write('[]')
    # The watcher thread only starts under __main__, so this test relies on
    # the 5 s TTL: after sleeping past it, the next read must hit the file.
    import time
    time.sleep(6)
    rv3 = client_app.post('/graphql', json={'query': '{ allUsers { id } }'})
    assert rv3.status_code == 200
    assert rv3.get_json()['data']['allUsers'] == []


# Model Breaking Proof

#### Model Breaking Task URL: <Add the URL here>

#### Model code:

```python
# code generated by the model

# <Issue>: <Add the issue in the model code here>

# code generated by the model
```