Index error when using PostGres #279

Open
DanRamage opened this issue Oct 29, 2014 · 1 comment

@DanRamage

This could feasibly happen on any indexed field, but I hit it while running a harvest against the NDBC SOS service. The exception was:

<ows:Exception exceptionCode="NoApplicableCode" locator="source">
<ows:ExceptionText>Harvest (insert) failed: ERROR: index row size 3956 exceeds maximum 2712 for index "ix_records_operateson"
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>

The source XML I used to direct the CSW to harvest from the SOS:

<?xml version="1.0" encoding="UTF-8"?>
<Harvest xmlns="http://www.opengis.net/cat/csw/2.0.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2
http://schemas.opengis.net/csw/2.0.2/CSW-publication.xsd" service="CSW"
version="2.0.2">
<Source>http://sdf.ndbc.noaa.gov/sos/server.php</Source>
<ResourceType>http://www.opengis.net/sos/1.0</ResourceType>
</Harvest>
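
For anyone trying to reproduce this, here is a minimal sketch of building the same Harvest request in Python and POSTing it to a CSW endpoint. The function names `build_harvest_request` and `post_harvest` are hypothetical helpers, and the CSW URL is left to the caller since it is not given in this report. The sketch omits the optional `xsi:schemaLocation` attribute.

```python
import urllib.request
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def build_harvest_request(source, resource_type):
    """Return a CSW 2.0.2 Harvest request body as UTF-8 bytes."""
    ET.register_namespace("csw", CSW_NS)
    root = ET.Element(
        f"{{{CSW_NS}}}Harvest", {"service": "CSW", "version": "2.0.2"}
    )
    ET.SubElement(root, f"{{{CSW_NS}}}Source").text = source
    ET.SubElement(root, f"{{{CSW_NS}}}ResourceType").text = resource_type
    return ET.tostring(root, encoding="utf-8")


def post_harvest(csw_url, body, timeout=60):
    """POST the request body to a CSW endpoint and return the response text.

    csw_url is whatever URL your pycsw instance is served from (assumption:
    the server accepts XML POST requests at that URL).
    """
    request = urllib.request.Request(
        csw_url, data=body, headers={"Content-Type": "application/xml"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


body = build_harvest_request(
    "http://sdf.ndbc.noaa.gov/sos/server.php",
    "http://www.opengis.net/sos/1.0",
)
```

Passing `body` to `post_harvest` with your own CSW URL should return either the Harvest response or an `ows:ExceptionReport` like the one above.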

The "operateson" value returned by the SOS server was extremely large: it contained an entry for every station served by the service, which pushed the index row past the PostgreSQL limit.
Tom suggested commenting out line 678 in metadata.py as a temporary workaround.
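
The hint in the error message points at another option: index an MD5 hash of the value instead of the value itself. Below is a rough sketch of why that helps and what the DDL might look like. The table name `records` is only inferred from the index name in the error, the index name `ix_records_operateson_md5` is made up, and this is not something pycsw does itself as far as this report shows.

```python
import hashlib

# Limit quoted in the PostgreSQL error above.
MAX_INDEX_ROW_BYTES = 2712


def md5_hex(value):
    """Hex MD5 digest of a string; always 32 characters, whatever the input size."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def md5_index_ddl(table, column):
    """Build CREATE INDEX DDL for a function index on md5(column).

    Identifiers are interpolated directly, so pass trusted names only.
    """
    return f'CREATE INDEX "ix_{table}_{column}_md5" ON "{table}" (md5("{column}"))'


# A stand-in for a very long operateson value (placeholder station names).
long_value = ",".join(f"station-{i:05d}" for i in range(400))

too_big_for_plain_index = len(long_value.encode("utf-8")) > MAX_INDEX_ROW_BYTES
hashed = md5_hex(long_value)
ddl = md5_index_ddl("records", "operateson")
```

Note that an index like this only helps equality lookups written against the same expression, for example `WHERE md5(operateson) = md5('...')`. Pattern matching on the column would not use it.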

@DanRamage
Author

A tangential out-of-memory issue I've just run into: with records already in the pycsw repository, I re-ran the harvest against the same endpoint and the pycsw server threw a MemoryException.

  File "/usr/local/src/python/lib/python2.7/wsgiref/handlers.py", line 85, in run
    self.result = application(self.environ, self.start_response)
  File "csw_staging.wsgi", line 100, in application
    contents = csw.dispatch_wsgi()
  File "/home/madrona/src/pycsw/pycsw/server.py", line 391, in dispatch_wsgi
    return self.dispatch()
  File "/home/madrona/src/pycsw/pycsw/server.py", line 550, in dispatch
    self.response = self.harvest()
  File "/home/madrona/src/pycsw/pycsw/server.py", line 1782, in harvest
    results = self.repository.query_source(content)
  File "/home/madrona/src/pycsw/pycsw/repository.py", line 239, in query_source
    return self._get_repo_filter(query).all()

I was watching "free -m" on the command line and could see memory usage climb until the process could no longer allocate any.
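
The traceback ends in a `.all()` call, which in SQLAlchemy builds the complete result list in memory. If that is what exhausts memory here (an assumption, not confirmed in this report), fetching rows in batches is one way to bound it. The sketch below shows the general idea with a plain DB-API cursor, using an in-memory SQLite database as a stand-in for the real repository. The helper `iter_rows`, the table layout, and the example URL are all hypothetical.

```python
import sqlite3


def iter_rows(cursor, batch_size=500):
    """Yield rows from an executed DB-API cursor, fetching batch_size at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


# In-memory SQLite stands in for the real repository database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT, source TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(f"id-{i}", "http://example.org/sos") for i in range(1200)],
)

cursor = conn.execute(
    "SELECT identifier FROM records WHERE source = ?",
    ("http://example.org/sos",),
)

# Rows are consumed one at a time; at most one batch is held by the helper.
count = sum(1 for _ in iter_rows(cursor, batch_size=500))
conn.close()
```

In SQLAlchemy the closest analogue is iterating the query with `yield_per()` rather than calling `.all()`. With psycopg2, a default client-side cursor still transfers the whole result set, so a named (server-side) cursor would be needed to get the same effect.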
