Cannot connect to RDS instance in the same VPC #343

Closed
FranzBusch opened this issue May 7, 2021 · 4 comments
Labels
Bug Something isn't working

Comments

@FranzBusch

FranzBusch commented May 7, 2021

Summary
I cannot connect from my ECS Task to an RDS instance inside the same VPC.

Steps to Reproduce
I use the howto-grpc-ingress-gateway example as a foundation and deployed an RDS instance in the same VPC using the following CloudFormation stack:

rds.yaml
Parameters:
  ProjectName:
    Type: String
    Description: Project name to link stacks

Resources:
  Database:
    Type: AWS::RDS::DBInstance
    Properties:
      VPCSecurityGroups:
      - Ref: SecurityGroup
      AllocatedStorage: '5'
      DBInstanceClass: db.t2.micro
      Engine: postgres
      MasterUsername: Username
      MasterUserPassword: Password
      DBSubnetGroupName: !Ref SubnetGroup
    DeletionPolicy: Snapshot

  SubnetGroup:
    Type: "AWS::RDS::DBSubnetGroup"
    Properties:
      DBSubnetGroupDescription: !Sub '${ProjectName}-database-subnet-group'
      SubnetIds:
        - Fn::ImportValue: !Sub '${ProjectName}:PrivateSubnet1'
        - Fn::ImportValue: !Sub '${ProjectName}:PrivateSubnet2'

  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: !Sub '${ProjectName}-database'
        SecurityGroupIngress:
        - SourceSecurityGroupId: 
            Fn::ImportValue:
              !Sub '${ProjectName}:ServiceSecurityGroup'
          IpProtocol: tcp
          ToPort: 5432
          FromPort: 5432
        - CidrIp: 0.0.0.0/0
          IpProtocol: tcp
          ToPort: 65535
          FromPort: 0
        VpcId: 
            Fn::ImportValue:
              !Sub '${ProjectName}:VPC'

## App Mesh
  VirtualNode:
    Type: AWS::AppMesh::VirtualNode
    Properties:
      MeshName: 
        Fn::ImportValue:
          !Sub '${ProjectName}:AppMeshName'
      VirtualNodeName: Database
      Spec:
        Listeners:
        - PortMapping:
            Port: 5432
            Protocol: tcp
        ServiceDiscovery:
          DNS:
            Hostname: !GetAtt Database.Endpoint.Address

  VirtualService:
    DependsOn:
    - VirtualNode
    Type: AWS::AppMesh::VirtualService
    Properties:
      MeshName: 
        Fn::ImportValue:
          !Sub '${ProjectName}:AppMeshName'
      VirtualServiceName: !Sub 'database.${ProjectName}.local'
      Spec:
        Provider:
          VirtualNode:
            VirtualNodeName: Database

Outputs:
    DatabaseEndpointUrl:
        Description: Database Endpoint URL
        Value: !GetAtt Database.Endpoint.Address

Inside my color-server main.go I adapted the GetColor handler to try to connect to the RDS instance:

func (s *colorServer) GetColor(ctx context.Context, in *pb.GetColorRequest) (*pb.GetColorResponse, error) {
	log.Printf("Received GetColor request")

	dsn := "host=database.ProjectName.local user=Username password=Password dbname=postgres port=5432"
	database, error := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if error != nil {
		log.Print(error)
	}
	database.AutoMigrate(&Credentials{})

	// test for random flakiness in the api
	if rand.Float32() < s.flakiness.Rate {
		code := codes.Code(s.flakiness.Code)
		return nil, status.Error(code, code.String())
	}
	return &pb.GetColorResponse{Color: s.color}, nil
}

When calling the gRPC endpoint I get a log saying it cannot resolve the hostname:

2021/05/07 08:11:34 /go/pkg/mod/gorm.io/driver/postgres@v1.1.0/migrator.go:157 failed to connect to `host=database.ProjectName.local user=MyName database=postgres`: hostname resolving error (lookup database.ProjectName.local on 10.0.0.2:53: no such host)

I also tried using the RDS endpoint directly instead of the name of the VirtualService. With that I get a different error, saying that the TCP connection was reset by the peer:

/go/pkg/mod/gorm.io/driver/postgres@v1.1.0/migrator.go:157 failed to connect to `host=zd9p9g6uapgjp6.cz8psbxmibbv.eu-central-1.rds.amazonaws.com user=Username database=postgresql`: failed to receive message (read tcp 10.0.113.47:36956->10.0.85.217:5432: read: connection reset by peer)

I think these are two different problems. The first is that DNS resolution doesn't work; I was under the impression that I should be able to connect to the database using the VirtualServiceName. The second error looks like the Envoy proxy is resetting the DB connection.

Are you currently working around this issue?
I was able to work around the TCP issue by using the EgressIgnoredPorts setting for my task definition. Is that actually the expected way to make this work?

Additional context
Both the RDS instance and the ECS tasks are in the same VPC. The RDS instance has a security group that allows inbound traffic on port 5432 from the security group of the tasks.

@FranzBusch FranzBusch added the Bug Something isn't working label May 7, 2021
@marceloboeira

It seems related to #62. Please take a look at that thread and also at the documentation on connectivity troubleshooting.

@FranzBusch

@marceloboeira Thanks for your reply. I read through all the documentation. Sadly, I don't see a potential solution in there.
I am using the latest Envoy image in my task definition, and it still only works with the EgressIgnoredPorts option.

The DNS name resolution issue, on the other hand, is probably related to #65.

@herrhound herrhound added this to Awaiting Triage in aws-app-mesh-known-issues May 13, 2021
@herrhound

Maybe related to #270

@suniltheta

Closing this issue in favor of #270 & #65

@herrhound herrhound removed this from Awaiting Triage in aws-app-mesh-known-issues Aug 30, 2021