Work-around for HTTP GET limits #10

Merged
merged 1 commit into from about 2 years ago

3 participants

Grier Johnson Chris Uzelac Erik Bourget
Grier Johnson
Collaborator

This is a work-around for the size limit on the HTTP GETs that range uses.
Very long requests fail, so certain large queries can't be made at all.
This change updates the library to automatically split the query and make
multiple requests when it is longer than 7500 characters.
Apache is generally configured for a limit of 8190 characters, but with the FQDN
plus other bits of the header, 7500 is a safer limit.

Let me know if it's too hacky... :)
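
A rough check of the 7500-character default, as a sketch rather than a statement about how any particular server counts: the length test in `expand` looks at the unquoted expression, but the URL carries the quoted form, and quoting turns each comma into `%2C`. The host names below are made up for illustration, `request_line_length` is a hypothetical helper, and `urllib.parse.quote` stands in for the `urllib2.quote` call in the Python 2 library.

```python
from urllib.parse import quote  # Python 3 counterpart of urllib2.quote

REQUEST_LINE_LIMIT = 8190  # Apache figure quoted in the description
MAX_CHAR = 7500            # default introduced by this change


def request_line_length(expr, path="/range/list"):
    """Length of 'GET <path>?<quoted expr> HTTP/1.1' (hypothetical helper)."""
    return len("GET %s?%s HTTP/1.1" % (path, quote(expr)))


# Made-up expression: 357 host names of 20 characters each, comma-separated.
expr = ",".join("host%04d.example.com" % i for i in range(357))

# The unquoted expression passes the length check ...
print(len(expr) <= MAX_CHAR)
# ... yet the quoted request line is longer than the quoted Apache figure.
print(request_line_length(expr) > REQUEST_LINE_LIMIT)
```

If that matters for a given deployment, comparing `len(quote(expr))` against the limit, or lowering `max_char`, would leave more headroom.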

Grier Johnson HTTP GET URL size limit workaround
7b2f33a
Chris Uzelac
Collaborator

I agree that it's gross but until we use something other than GET, it's necessary.

Is there a way to future-proof this by only enabling it for GET requests?

Grier Johnson
Collaborator

The fix is only in the Python library at this point, which just uses urllib2 to do a GET request. We'd have to change that part anyway if a different request method ever came into the picture. The extra requests could then be removed, or kept only for legacy GET requests.
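
One way to read the future-proofing question, sketched under assumptions: keep the length check behind a test on the request method, so a later switch away from GET bypasses the split without further changes. The `method` argument and the `needs_split` helper are hypothetical; the library shown in this pull request has no such setting.

```python
def needs_split(expr, method="GET", max_char=7500):
    """Return True only when a GET request would carry an over-long query.

    `method` is a hypothetical parameter used for illustration; the
    library in this pull request always issues GET requests.
    """
    return method.upper() == "GET" and len(expr) > max_char


long_expr = "a" * 7501
print(needs_split(long_expr))          # GET with an over-long query
print(needs_split(long_expr, "POST"))  # other methods skip the split
```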

Erik Bourget erikwb merged commit ef1d7e9 into from April 01, 2012
Erik Bourget erikwb closed this April 01, 2012

Showing 1 unique commit by 1 author.

Mar 06, 2012
Grier Johnson HTTP GET URL size limit workaround
7b2f33a

Showing 1 changed file with 68 additions and 1 deletion.

python_seco_range/source/seco/range.py (69 changes)
@@ -17,14 +17,24 @@ def __str__(self):
         return repr(self.value)
 
 class Range(object):
-    def __init__(self, host, user_agent=None):
+    def __init__(self, host, user_agent=None, max_char=7500):
         self.host = host
+        self.max_char = max_char
         self.headers = {}
         self.headers['User-Agent'] = self.get_user_agent(user_agent)
 
     def expand(self, expr, ret_list=True):
         if isinstance(expr, list):
                 expr = ','.join(expr)
+
+        # If the query is too large for a single query, send it off to
+        # split functions
+        if len(expr) > self.max_char:
+            if ret_list:
+                return self.split_query(expr, ret_list)
+            else:
+                return self.split_collapse(expr)
+
         if ret_list:
             url = 'http://%s/range/list?%s' % (self.host, urllib2.quote(expr))
         else:
@@ -52,8 +62,65 @@ def expand(self, expr, ret_list=True):
             return req.read()
 
     def collapse(self, expr):
+        '''
+        Convenience function for returning collapsed format instead
+        of an individual list
+        '''
         return self.expand(expr, ret_list=False)
 
+    def split_query(self, expr, ret_list):
+        '''
+        Range queries are GETs, which have a URL limit of 8190 on
+        apache systems.    This method splits up long queries and
+        makes multiple calls, merging the result into a list.
+
+        This is, admittedly, a total hack.    Should fix range to accept PUT
+        for queries.
+        '''
+        final_list = []
+        new_list = self.build_split_list(expr)
+        for short_expr in new_list:
+            final_list.append(self.expand(short_expr, ret_list=ret_list))
+
+        return final_list
+
+    def split_collapse(self, expr):
+        '''
+        Helper function for split collapses, since they may need to split
+        and call multiple times to get the final collapsed list
+        '''
+        prev_expr = ''
+        coll_expr = expr
+        # Keep collapsing until the list stops changing
+        while prev_expr != coll_expr:
+            prev_expr = coll_expr
+            coll_list = self.split_query(coll_expr, ret_list=False)
+            coll_expr = (','.join(coll_list)).strip(',')
+        return coll_expr
+
+    def build_split_list(self, expr):
+        '''
+        Take the max_char function and break up an expression list based on
+        the character limits of individual items
+        '''
+        if isinstance(expr, str):
+            expr = expr.split(',')
+            expr.sort()
+        new_list = []
+        running_total = 0
+        position = 0
+        for range in expr:
+            running_total += len(range) + 1
+            if running_total > self.max_char:
+                running_total = 0
+                position += 1
+            try:
+                new_list[position].append(range)
+            except (AttributeError, IndexError):
+                new_list.append([range,])
+
+        return new_list
+
     def get_user_agent(self, provided_agent):
         """
         Build a verbose User-Agent for sending to the range server.
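
For comparison, a standalone sketch of the split-and-merge step, not the code that was merged. It differs from the diff in two ways that may or may not matter in practice: the item that would push a chunk past the limit is counted toward the next chunk, and per-chunk results are merged with `extend` so the caller gets one flat list (in the diff, `split_query` appends each `expand` result, which would nest the results if `expand` returns a list per chunk). `build_chunks`, `expand_all` and the `fetch` callable are hypothetical names; `fetch` stands in for a single-request `expand` call.

```python
def build_chunks(expr, max_char=7500):
    """Greedily pack comma-separated items into chunks whose joined
    length stays at or below max_char (hypothetical helper)."""
    items = expr.split(',') if isinstance(expr, str) else list(expr)
    chunks, current, size = [], [], 0
    for item in items:
        # A joining comma is only needed when the chunk already has items.
        extra = len(item) + (1 if current else 0)
        if current and size + extra > max_char:
            chunks.append(current)
            current, size = [], 0
            extra = len(item)
        current.append(item)
        size += extra
    if current:
        chunks.append(current)
    return chunks


def expand_all(expr, fetch, max_char=7500):
    """Call fetch(chunk_expr) once per chunk and merge into one flat list."""
    merged = []
    for chunk in build_chunks(expr, max_char):
        merged.extend(fetch(','.join(chunk)))
    return merged


# Stand-in for a real range query: just split the expression it is given.
fake_fetch = lambda chunk_expr: chunk_expr.split(',')
print(build_chunks("aa,bb,cc,dd", max_char=5))
print(expand_all("aa,bb,cc,dd", fake_fetch, max_char=5))
```

Like the diff, this splits naively on commas, so an expression with commas inside a grouping construct would be cut in the wrong place, and a single item longer than `max_char` still ends up in a chunk of its own that exceeds the limit.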